This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/arrow-site.git
The following commit(s) were added to refs/heads/asf-site by this push: new 15b0a5d89c1 Updating dev docs (build nightly-tests-2025-09-07-0) 15b0a5d89c1 is described below commit 15b0a5d89c1cff784c759504e292fc72f92d82ae Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com> AuthorDate: Mon Sep 8 00:33:59 2025 +0000 Updating dev docs (build nightly-tests-2025-09-07-0) --- ...id-8ff51316a5bfe716c8346df112ea33beaa5228f4.svg | 2 +- docs/dev/python/data.html | 46 +++---- docs/dev/python/dataset.html | 136 ++++++++++----------- docs/dev/python/getstarted.html | 2 +- docs/dev/python/memory.html | 6 +- docs/dev/python/pandas.html | 6 +- docs/dev/python/parquet.html | 12 +- docs/dev/r/articles/data_wrangling.html | 24 ++-- docs/dev/r/news/index.html | 86 ++++++------- docs/dev/r/pkgdown.yml | 2 +- docs/dev/r/reference/to_arrow.html | 6 +- docs/dev/r/reference/to_duckdb.html | 10 +- docs/dev/r/search.json | 2 +- docs/dev/searchindex.js | 2 +- 14 files changed, 171 insertions(+), 171 deletions(-) diff --git a/docs/dev/_images/mermaid-8ff51316a5bfe716c8346df112ea33beaa5228f4.svg b/docs/dev/_images/mermaid-8ff51316a5bfe716c8346df112ea33beaa5228f4.svg index 30fda8c9364..e0241a60bd5 100644 --- a/docs/dev/_images/mermaid-8ff51316a5bfe716c8346df112ea33beaa5228f4.svg +++ b/docs/dev/_images/mermaid-8ff51316a5bfe716c8346df112ea33beaa5228f4.svg @@ -1 +1 @@ -<svg aria-roledescription="flowchart-v2" role="graphics-document document" viewBox="0 0 1645.734375 786.1171875" style="max-width: 1645.73px; background-color: white;" class="flowchart" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns="http://www.w3.org/2000/svg" width="100%" id="my-svg"><style>#my-svg{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;fill:#333;}@keyframes edge-animation-frame{from{stroke-dashoffset:0;}}@keyframes dash{to{stroke-dashoffset:0;}}#my-svg .e [...] 
\ No newline at end of file +<svg aria-roledescription="flowchart-v2" role="graphics-document document" viewBox="0 0 1645.734375 786.1171875" style="max-width: 1645.73px; background-color: white;" class="flowchart" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns="http://www.w3.org/2000/svg" width="100%" id="my-svg"><style>#my-svg{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;fill:#333;}@keyframes edge-animation-frame{from{stroke-dashoffset:0;}}@keyframes dash{to{stroke-dashoffset:0;}}#my-svg .e [...] \ No newline at end of file diff --git a/docs/dev/python/data.html b/docs/dev/python/data.html index f8a61593f47..12295cffcd6 100644 --- a/docs/dev/python/data.html +++ b/docs/dev/python/data.html @@ -1684,7 +1684,7 @@ for you:</p> <span class="gp">In [29]: </span><span class="n">arr</span> <span class="gh">Out[29]: </span> -<span class="go"><pyarrow.lib.Int64Array object at 0x7fb7decca380></span> +<span class="go"><pyarrow.lib.Int64Array object at 0x7f30fd7e0a60></span> <span class="go">[</span> <span class="go"> 1,</span> <span class="go"> 2,</span> @@ -1696,7 +1696,7 @@ for you:</p> <p>But you may also pass a specific data type to override type inference:</p> <div class="highlight-ipython notranslate"><div class="highlight"><pre><span></span><span class="gp">In [30]: </span><span class="n">pa</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="nb">type</span><span class="o">=</span><span class="n">pa</span><span class="o">.</span><span class="n">uint16</span><span class="p">())</span> <span class="gh">Out[30]: </span> -<span class="go"><pyarrow.lib.UInt16Array object at 0x7fb7decca740></span> +<span class="go"><pyarrow.lib.UInt16Array object at 0x7f30fd7e0fa0></span> <span class="go">[</span> <span class="go"> 1,</span> <span class="go"> 2</span> @@ -1731,7 +1731,7 @@ nulls:</p> <p>Arrays can be sliced 
without copying:</p> <div class="highlight-ipython notranslate"><div class="highlight"><pre><span></span><span class="gp">In [36]: </span><span class="n">arr</span><span class="p">[</span><span class="mi">1</span><span class="p">:</span><span class="mi">3</span><span class="p">]</span> <span class="gh">Out[36]: </span> -<span class="go"><pyarrow.lib.Int64Array object at 0x7fb7deccb280></span> +<span class="go"><pyarrow.lib.Int64Array object at 0x7f30fd7bb5e0></span> <span class="go">[</span> <span class="go"> 2,</span> <span class="go"> null</span> @@ -1784,7 +1784,7 @@ This allows for ListView arrays to specify out-of-order offsets:</p> <span class="gp">In [45]: </span><span class="n">arr</span> <span class="gh">Out[45]: </span> -<span class="go"><pyarrow.lib.ListViewArray object at 0x7fb7deb40040></span> +<span class="go"><pyarrow.lib.ListViewArray object at 0x7f30fd7e22c0></span> <span class="go">[</span> <span class="go"> [</span> <span class="go"> 5,</span> @@ -1809,7 +1809,7 @@ This allows for ListView arrays to specify out-of-order offsets:</p> dictionaries:</p> <div class="highlight-ipython notranslate"><div class="highlight"><pre><span></span><span class="gp">In [46]: </span><span class="n">pa</span><span class="o">.</span><span class="n">array</span><span class="p">([{</span><span class="s1">'x'</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="s1">'y'</span><span class="p">:</span> <span class="kc">True</span><span class="p">},</span> <span class="p">{</span><span class="s1">'z& [...] 
<span class="gh">Out[46]: </span> -<span class="go"><pyarrow.lib.StructArray object at 0x7fb7deb40400></span> +<span class="go"><pyarrow.lib.StructArray object at 0x7f30fd7e2740></span> <span class="go">-- is_valid: all not null</span> <span class="go">-- child 0 type: int64</span> <span class="go"> [</span> @@ -1836,7 +1836,7 @@ you must explicitly pass the type:</p> <span class="gp">In [48]: </span><span class="n">pa</span><span class="o">.</span><span class="n">array</span><span class="p">([{</span><span class="s1">'x'</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="s1">'y'</span><span class="p">:</span> <span class="kc">True</span><span class="p">},</span> <span class="p">{</span><span class="s1">'x'</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span [...] <span class="gh">Out[48]: </span> -<span class="go"><pyarrow.lib.StructArray object at 0x7fb7deb40a00></span> +<span class="go"><pyarrow.lib.StructArray object at 0x7f30fd7e2d40></span> <span class="go">-- is_valid: all not null</span> <span class="go">-- child 0 type: int8</span> <span class="go"> [</span> @@ -1851,7 +1851,7 @@ you must explicitly pass the type:</p> <span class="gp">In [49]: </span><span class="n">pa</span><span class="o">.</span><span class="n">array</span><span class="p">([(</span><span class="mi">3</span><span class="p">,</span> <span class="kc">True</span><span class="p">),</span> <span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="kc">False</span><span class="p">)],</span> <span class="nb">type</span><span class="o">=</span><span class="n">ty</span><span class="p">)</span> <span class="gh">Out[49]: </span> -<span class="go"><pyarrow.lib.StructArray object at 0x7fb7deb40ac0></span> +<span class="go"><pyarrow.lib.StructArray object at 0x7f30fd7e2e60></span> <span class="go">-- is_valid: all not null</span> <span class="go">-- child 0 type: int8</span> <span class="go"> [</span> @@ 
-1870,7 +1870,7 @@ level and at the individual field level. If initializing from a sequence of Python dicts, a missing dict key is handled as a null value:</p> <div class="highlight-ipython notranslate"><div class="highlight"><pre><span></span><span class="gp">In [50]: </span><span class="n">pa</span><span class="o">.</span><span class="n">array</span><span class="p">([{</span><span class="s1">'x'</span><span class="p">:</span> <span class="mi">1</span><span class="p">},</span> <span class="kc">None</span><span class="p">,</span> <span class="p">{</span><span class="s1">'y'</span><span class="p">:</span> <span class="kc">None</s [...] <span class="gh">Out[50]: </span> -<span class="go"><pyarrow.lib.StructArray object at 0x7fb7deb40b20></span> +<span class="go"><pyarrow.lib.StructArray object at 0x7f30fd7e2ec0></span> <span class="go">-- is_valid:</span> <span class="go"> [</span> <span class="go"> true,</span> @@ -1905,7 +1905,7 @@ individual arrays, and no copy is involved:</p> <span class="gp">In [55]: </span><span class="n">arr</span> <span class="gh">Out[55]: </span> -<span class="go"><pyarrow.lib.StructArray object at 0x7fb7deccae60></span> +<span class="go"><pyarrow.lib.StructArray object at 0x7f30fd7e3760></span> <span class="go">-- is_valid: all not null</span> <span class="go">-- child 0 type: int16</span> <span class="go"> [</span> @@ -1932,7 +1932,7 @@ the type is explicitly passed into <a class="reference internal" href="generated <span class="gp">In [58]: </span><span class="n">pa</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="n">ty</span><span class="p">)</span> <span class="gh">Out[58]: </span> -<span class="go"><pyarrow.lib.MapArray object at 0x7fb7deb41660></span> +<span class="go"><pyarrow.lib.MapArray object at 0x7f30fd7e3be0></span> <span class="go">[</span> <span class="go"> keys:</span> <span 
class="go"> [</span> @@ -1966,7 +1966,7 @@ their row, use the <a class="reference internal" href="generated/pyarrow.ListArr <span class="gp">In [60]: </span><span class="n">arr</span><span class="o">.</span><span class="n">keys</span> <span class="gh">Out[60]: </span> -<span class="go"><pyarrow.lib.StringArray object at 0x7fb7deb41900></span> +<span class="go"><pyarrow.lib.StringArray object at 0x7f30fd7e3e80></span> <span class="go">[</span> <span class="go"> "x",</span> <span class="go"> "y",</span> @@ -1975,7 +1975,7 @@ their row, use the <a class="reference internal" href="generated/pyarrow.ListArr <span class="gp">In [61]: </span><span class="n">arr</span><span class="o">.</span><span class="n">items</span> <span class="gh">Out[61]: </span> -<span class="go"><pyarrow.lib.Int64Array object at 0x7fb7deb414e0></span> +<span class="go"><pyarrow.lib.Int64Array object at 0x7f30fd7e2c80></span> <span class="go">[</span> <span class="go"> 4,</span> <span class="go"> 5,</span> @@ -1984,7 +1984,7 @@ their row, use the <a class="reference internal" href="generated/pyarrow.ListArr <span class="gp">In [62]: </span><span class="n">pa</span><span class="o">.</span><span class="n">ListArray</span><span class="o">.</span><span class="n">from_arrays</span><span class="p">(</span><span class="n">arr</span><span class="o">.</span><span class="n">offsets</span><span class="p">,</span> <span class="n">arr</span><span class="o">.</span><span class="n">keys</span><span class="p">)</span> <span class="gh">Out[62]: </span> -<span class="go"><pyarrow.lib.ListArray object at 0x7fb7deb41b40></span> +<span class="go"><pyarrow.lib.ListArray object at 0x7f30fd600100></span> <span class="go">[</span> <span class="go"> [</span> <span class="go"> "x",</span> @@ -1997,7 +1997,7 @@ their row, use the <a class="reference internal" href="generated/pyarrow.ListArr <span class="gp">In [63]: </span><span class="n">pa</span><span class="o">.</span><span class="n">ListArray</span><span 
class="o">.</span><span class="n">from_arrays</span><span class="p">(</span><span class="n">arr</span><span class="o">.</span><span class="n">offsets</span><span class="p">,</span> <span class="n">arr</span><span class="o">.</span><span class="n">items</span><span class="p">)</span> <span class="gh">Out[63]: </span> -<span class="go"><pyarrow.lib.ListArray object at 0x7fb7deb41ae0></span> +<span class="go"><pyarrow.lib.ListArray object at 0x7f30fd6000a0></span> <span class="go">[</span> <span class="go"> [</span> <span class="go"> 4,</span> @@ -2032,7 +2032,7 @@ selected:</p> <span class="gp">In [69]: </span><span class="n">union_arr</span> <span class="gh">Out[69]: </span> -<span class="go"><pyarrow.lib.UnionArray object at 0x7fb7deb41f60></span> +<span class="go"><pyarrow.lib.UnionArray object at 0x7f30fd7bbe80></span> <span class="go">-- is_valid: all not null</span> <span class="go">-- type_ids: [</span> <span class="go"> 0,</span> @@ -2071,7 +2071,7 @@ each offset in the selected child array it can be found:</p> <span class="gp">In [76]: </span><span class="n">union_arr</span> <span class="gh">Out[76]: </span> -<span class="go"><pyarrow.lib.UnionArray object at 0x7fb7deb42980></span> +<span class="go"><pyarrow.lib.UnionArray object at 0x7f30fd6007c0></span> <span class="go">-- is_valid: all not null</span> <span class="go">-- type_ids: [</span> <span class="go"> 0,</span> @@ -2120,7 +2120,7 @@ consider an example:</p> <span class="gp">In [80]: </span><span class="n">dict_array</span> <span class="gh">Out[80]: </span> -<span class="go"><pyarrow.lib.DictionaryArray object at 0x7fb7e4432810></span> +<span class="go"><pyarrow.lib.DictionaryArray object at 0x7f3100842030></span> <span class="go">-- dictionary:</span> <span class="go"> [</span> @@ -2147,7 +2147,7 @@ consider an example:</p> <span class="gp">In [82]: </span><span class="n">dict_array</span><span class="o">.</span><span class="n">indices</span> <span class="gh">Out[82]: </span> -<span 
class="go"><pyarrow.lib.Int64Array object at 0x7fb7deb43100></span> +<span class="go"><pyarrow.lib.Int64Array object at 0x7f30fd6011e0></span> <span class="go">[</span> <span class="go"> 0,</span> <span class="go"> 1,</span> @@ -2161,7 +2161,7 @@ consider an example:</p> <span class="gp">In [83]: </span><span class="n">dict_array</span><span class="o">.</span><span class="n">dictionary</span> <span class="gh">Out[83]: </span> -<span class="go"><pyarrow.lib.StringArray object at 0x7fb7deb42bc0></span> +<span class="go"><pyarrow.lib.StringArray object at 0x7f30fd601360></span> <span class="go">[</span> <span class="go"> "foo",</span> <span class="go"> "bar",</span> @@ -2217,7 +2217,7 @@ instances. Let’s consider a collection of arrays:</p> <span class="gp">In [90]: </span><span class="n">batch</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="gh">Out[90]: </span> -<span class="go"><pyarrow.lib.StringArray object at 0x7fb7deccb5e0></span> +<span class="go"><pyarrow.lib.StringArray object at 0x7f30fd601ba0></span> <span class="go">[</span> <span class="go"> "foo",</span> <span class="go"> "bar",</span> @@ -2231,7 +2231,7 @@ instances. 
Let’s consider a collection of arrays:</p> <span class="gp">In [92]: </span><span class="n">batch2</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="gh">Out[92]: </span> -<span class="go"><pyarrow.lib.StringArray object at 0x7fb7deb41240></span> +<span class="go"><pyarrow.lib.StringArray object at 0x7f30fd602200></span> <span class="go">[</span> <span class="go"> "bar",</span> <span class="go"> "baz",</span> @@ -2275,7 +2275,7 @@ container for one or more arrays of the same type.</p> <span class="gp">In [98]: </span><span class="n">c</span> <span class="gh">Out[98]: </span> -<span class="go"><pyarrow.lib.ChunkedArray object at 0x7fb7deb78400></span> +<span class="go"><pyarrow.lib.ChunkedArray object at 0x7f30fd602740></span> <span class="go">[</span> <span class="go"> [</span> <span class="go"> 1,</span> @@ -2309,7 +2309,7 @@ container for one or more arrays of the same type.</p> <span class="gp">In [100]: </span><span class="n">c</span><span class="o">.</span><span class="n">chunk</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="gh">Out[100]: </span> -<span class="go"><pyarrow.lib.Int64Array object at 0x7fb7deb788e0></span> +<span class="go"><pyarrow.lib.Int64Array object at 0x7f30fd602440></span> <span class="go">[</span> <span class="go"> 1,</span> <span class="go"> 2,</span> diff --git a/docs/dev/python/dataset.html b/docs/dev/python/dataset.html index aea3a35306b..118dea7095c 100644 --- a/docs/dev/python/dataset.html +++ b/docs/dev/python/dataset.html @@ -1570,7 +1570,7 @@ can pass it the path to the directory containing the data files:</p> <span class="gp">In [12]: </span><span class="n">dataset</span> <span class="o">=</span> <span class="n">ds</span><span class="o">.</span><span class="n">dataset</span><span class="p">(</span><span class="n">base</span> <span class="o">/</span> <span class="s2">"parquet_dataset"</span><span class="p">,</span> <span 
class="nb">format</span><span class="o">=</span><span class="s2">"parquet"</span><span class="p">)</span> <span class="gp">In [13]: </span><span class="n">dataset</span> -<span class="gh">Out[13]: </span><span class="go"><pyarrow._dataset.FileSystemDataset at 0x7fb7decc8b80></span> +<span class="gh">Out[13]: </span><span class="go"><pyarrow._dataset.FileSystemDataset at 0x7f30fd6490c0></span> </pre></div> </div> <p>In addition to searching a base directory, <a class="reference internal" href="generated/pyarrow.dataset.dataset.html#pyarrow.dataset.dataset" title="pyarrow.dataset.dataset"><code class="xref py py-func docutils literal notranslate"><span class="pre">dataset()</span></code></a> accepts a path to a @@ -1579,8 +1579,8 @@ single file or a list of file paths.</p> needed, it only crawls the directory to find all the files:</p> <div class="highlight-ipython notranslate"><div class="highlight"><pre><span></span><span class="gp">In [14]: </span><span class="n">dataset</span><span class="o">.</span><span class="n">files</span> <span class="gh">Out[14]: </span> -<span class="go">['/tmp/pyarrow-usuae05q/parquet_dataset/data1.parquet',</span> -<span class="go"> '/tmp/pyarrow-usuae05q/parquet_dataset/data2.parquet']</span> +<span class="go">['/tmp/pyarrow-8uhfu8uh/parquet_dataset/data1.parquet',</span> +<span class="go"> '/tmp/pyarrow-8uhfu8uh/parquet_dataset/data2.parquet']</span> </pre></div> </div> <p>… and infers the dataset’s schema (by default from the first file):</p> @@ -1601,23 +1601,23 @@ this can require a lot of memory, see below on filtering / iterative loading):</ <span class="go">c: int64</span> <span class="gt">----</span> <span class="ne">a</span>: [[0,1,2,3,4],[5,6,7,8,9]] -<span class="ne">b</span>: [[-0.46528353665756017,-2.1646697161723525,-0.9725688395877113,-0.6151625333912072,-0.009834580397950961],[0.19813176146070893,-2.564821452813575,-0.17453274029554844,-0.8237788847376623,-0.47286990999678424]] +<span class="ne">b</span>: 
[[-1.6122948857555386,-0.8684479878691105,-0.8059951258244353,-0.07524959837131494,-0.5041024791644488],[-1.036094073575387,0.18109790208384172,1.4040614638673674,0.27403914176679883,0.30585196341271126]] <span class="ne">c</span>: [[1,2,1,2,1],[2,1,2,1,2]] <span class="c1"># converting to pandas to see the contents of the scanned table</span> <span class="gp">In [17]: </span><span class="n">dataset</span><span class="o">.</span><span class="n">to_table</span><span class="p">()</span><span class="o">.</span><span class="n">to_pandas</span><span class="p">()</span> <span class="gh">Out[17]: </span> <span class="go"> a b c</span> -<span class="go">0 0 -0.465284 1</span> -<span class="go">1 1 -2.164670 2</span> -<span class="go">2 2 -0.972569 1</span> -<span class="go">3 3 -0.615163 2</span> -<span class="go">4 4 -0.009835 1</span> -<span class="go">5 5 0.198132 2</span> -<span class="go">6 6 -2.564821 1</span> -<span class="go">7 7 -0.174533 2</span> -<span class="go">8 8 -0.823779 1</span> -<span class="go">9 9 -0.472870 2</span> +<span class="go">0 0 -1.612295 1</span> +<span class="go">1 1 -0.868448 2</span> +<span class="go">2 2 -0.805995 1</span> +<span class="go">3 3 -0.075250 2</span> +<span class="go">4 4 -0.504102 1</span> +<span class="go">5 5 -1.036094 2</span> +<span class="go">6 6 0.181098 1</span> +<span class="go">7 7 1.404061 2</span> +<span class="go">8 8 0.274039 1</span> +<span class="go">9 9 0.305852 2</span> </pre></div> </div> </section> @@ -1640,11 +1640,11 @@ supported; more formats are planned in the future.</p> <span class="gp">In [21]: </span><span class="n">dataset</span><span class="o">.</span><span class="n">to_table</span><span class="p">()</span><span class="o">.</span><span class="n">to_pandas</span><span class="p">()</span><span class="o">.</span><span class="n">head</span><span class="p">()</span> <span class="gh">Out[21]: </span> <span class="go"> a b c</span> -<span class="go">0 0 -0.465284 1</span> -<span class="go">1 1 -2.164670 
2</span> -<span class="go">2 2 -0.972569 1</span> -<span class="go">3 3 -0.615163 2</span> -<span class="go">4 4 -0.009835 1</span> +<span class="go">0 0 -1.612295 1</span> +<span class="go">1 1 -0.868448 2</span> +<span class="go">2 2 -0.805995 1</span> +<span class="go">3 3 -0.075250 2</span> +<span class="go">4 4 -0.504102 1</span> </pre></div> </div> </section> @@ -1676,16 +1676,16 @@ supported; more formats are planned in the future.</p> <span class="gp">In [23]: </span><span class="n">dataset</span><span class="o">.</span><span class="n">to_table</span><span class="p">(</span><span class="n">columns</span><span class="o">=</span><span class="p">[</span><span class="s1">'a'</span><span class="p">,</span> <span class="s1">'b'</span><span class="p">])</span><span class="o">.</span><span class="n">to_pandas</span><span class="p">()</span> <span class="gh">Out[23]: </span> <span class="go"> a b</span> -<span class="go">0 0 -0.465284</span> -<span class="go">1 1 -2.164670</span> -<span class="go">2 2 -0.972569</span> -<span class="go">3 3 -0.615163</span> -<span class="go">4 4 -0.009835</span> -<span class="go">5 5 0.198132</span> -<span class="go">6 6 -2.564821</span> -<span class="go">7 7 -0.174533</span> -<span class="go">8 8 -0.823779</span> -<span class="go">9 9 -0.472870</span> +<span class="go">0 0 -1.612295</span> +<span class="go">1 1 -0.868448</span> +<span class="go">2 2 -0.805995</span> +<span class="go">3 3 -0.075250</span> +<span class="go">4 4 -0.504102</span> +<span class="go">5 5 -1.036094</span> +<span class="go">6 6 0.181098</span> +<span class="go">7 7 1.404061</span> +<span class="go">8 8 0.274039</span> +<span class="go">9 9 0.305852</span> </pre></div> </div> <p>With the <code class="docutils literal notranslate"><span class="pre">filter</span></code> keyword, rows which do not match the filter predicate will @@ -1694,18 +1694,18 @@ not be included in the returned table. 
The keyword expects a boolean <div class="highlight-ipython notranslate"><div class="highlight"><pre><span></span><span class="gp">In [24]: </span><span class="n">dataset</span><span class="o">.</span><span class="n">to_table</span><span class="p">(</span><span class="nb">filter</span><span class="o">=</span><span class="n">ds</span><span class="o">.</span><span class="n">field</span><span class="p">(</span><span class="s1">'a'</span><span class="p">)</span> <span class="o">>=</span> <span class="mi">7</sp [...] <span class="gh">Out[24]: </span> <span class="go"> a b c</span> -<span class="go">0 7 -0.174533 2</span> -<span class="go">1 8 -0.823779 1</span> -<span class="go">2 9 -0.472870 2</span> +<span class="go">0 7 1.404061 2</span> +<span class="go">1 8 0.274039 1</span> +<span class="go">2 9 0.305852 2</span> <span class="gp">In [25]: </span><span class="n">dataset</span><span class="o">.</span><span class="n">to_table</span><span class="p">(</span><span class="nb">filter</span><span class="o">=</span><span class="n">ds</span><span class="o">.</span><span class="n">field</span><span class="p">(</span><span class="s1">'c'</span><span class="p">)</span> <span class="o">==</span> <span class="mi">2</span><span class="p">)</span><span class="o">.</span><span class="n">to_pandas</span><spa [...] 
<span class="gh">Out[25]: </span> <span class="go"> a b c</span> -<span class="go">0 1 -2.164670 2</span> -<span class="go">1 3 -0.615163 2</span> -<span class="go">2 5 0.198132 2</span> -<span class="go">3 7 -0.174533 2</span> -<span class="go">4 9 -0.472870 2</span> +<span class="go">0 1 -0.868448 2</span> +<span class="go">1 3 -0.075250 2</span> +<span class="go">2 5 -1.036094 2</span> +<span class="go">3 7 1.404061 2</span> +<span class="go">4 9 0.305852 2</span> </pre></div> </div> <p>The easiest way to construct those <a class="reference internal" href="generated/pyarrow.dataset.Expression.html#pyarrow.dataset.Expression" title="pyarrow.dataset.Expression"><code class="xref py py-class docutils literal notranslate"><span class="pre">Expression</span></code></a> objects is by using the @@ -1750,11 +1750,11 @@ values:</p> <span class="gp">In [30]: </span><span class="n">dataset</span><span class="o">.</span><span class="n">to_table</span><span class="p">(</span><span class="n">columns</span><span class="o">=</span><span class="n">projection</span><span class="p">)</span><span class="o">.</span><span class="n">to_pandas</span><span class="p">()</span><span class="o">.</span><span class="n">head</span><span class="p">()</span> <span class="gh">Out[30]: </span> <span class="go"> a_renamed b_as_float32 c_1</span> -<span class="go">0 0 -0.465284 True</span> -<span class="go">1 1 -2.164670 False</span> -<span class="go">2 2 -0.972569 True</span> -<span class="go">3 3 -0.615163 False</span> -<span class="go">4 4 -0.009835 True</span> +<span class="go">0 0 -1.612295 True</span> +<span class="go">1 1 -0.868448 False</span> +<span class="go">2 2 -0.805995 True</span> +<span class="go">3 3 -0.075250 False</span> +<span class="go">4 4 -0.504102 True</span> </pre></div> </div> <p>The dictionary also determines the column selection (only the keys in the @@ -1768,11 +1768,11 @@ build up the dictionary from the dataset schema:</p> <span class="gp">In [33]: </span><span 
class="n">dataset</span><span class="o">.</span><span class="n">to_table</span><span class="p">(</span><span class="n">columns</span><span class="o">=</span><span class="n">projection</span><span class="p">)</span><span class="o">.</span><span class="n">to_pandas</span><span class="p">()</span><span class="o">.</span><span class="n">head</span><span class="p">()</span> <span class="gh">Out[33]: </span> <span class="go"> a b c b_large</span> -<span class="go">0 0 -0.465284 1 False</span> -<span class="go">1 1 -2.164670 2 False</span> -<span class="go">2 2 -0.972569 1 False</span> -<span class="go">3 3 -0.615163 2 False</span> -<span class="go">4 4 -0.009835 1 False</span> +<span class="go">0 0 -1.612295 1 False</span> +<span class="go">1 1 -0.868448 2 False</span> +<span class="go">2 2 -0.805995 1 False</span> +<span class="go">3 3 -0.075250 2 False</span> +<span class="go">4 4 -0.504102 1 False</span> </pre></div> </div> </section> @@ -1825,8 +1825,8 @@ should use a hive-like partitioning scheme with the <code class="docutils litera <span class="gp">In [37]: </span><span class="n">dataset</span><span class="o">.</span><span class="n">files</span> <span class="gh">Out[37]: </span> -<span class="go">['parquet_dataset_partitioned/part=a/527c25b984184fbb86115d8966c86d0c-0.parquet',</span> -<span class="go"> 'parquet_dataset_partitioned/part=b/527c25b984184fbb86115d8966c86d0c-0.parquet']</span> +<span class="go">['parquet_dataset_partitioned/part=a/a2a9b276f54e4e9bb15822f9ce842cb0-0.parquet',</span> +<span class="go"> 'parquet_dataset_partitioned/part=b/a2a9b276f54e4e9bb15822f9ce842cb0-0.parquet']</span> </pre></div> </div> <p>Although the partition fields are not included in the actual Parquet files, @@ -1834,9 +1834,9 @@ they will be added back to the resulting table when scanning this dataset:</p> <div class="highlight-ipython notranslate"><div class="highlight"><pre><span></span><span class="gp">In [38]: </span><span class="n">dataset</span><span 
class="o">.</span><span class="n">to_table</span><span class="p">()</span><span class="o">.</span><span class="n">to_pandas</span><span class="p">()</span><span class="o">.</span><span class="n">head</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span> <span class="gh">Out[38]: </span> <span class="go"> a b c part</span> -<span class="go">0 0 -0.798984 1 a</span> -<span class="go">1 1 -1.168478 2 a</span> -<span class="go">2 2 -2.194629 1 a</span> +<span class="go">0 0 0.018885 1 a</span> +<span class="go">1 1 -0.572126 2 a</span> +<span class="go">2 2 1.120338 1 a</span> </pre></div> </div> <p>We can now filter on the partition keys, which avoids loading files @@ -1844,11 +1844,11 @@ altogether if they do not match the filter:</p> <div class="highlight-ipython notranslate"><div class="highlight"><pre><span></span><span class="gp">In [39]: </span><span class="n">dataset</span><span class="o">.</span><span class="n">to_table</span><span class="p">(</span><span class="nb">filter</span><span class="o">=</span><span class="n">ds</span><span class="o">.</span><span class="n">field</span><span class="p">(</span><span class="s2">"part"</span><span class="p">)</span> <span class="o">==</span> <span class="s2">&qu [...] 
<span class="gh">Out[39]: </span> <span class="go"> a b c part</span> -<span class="go">0 5 1.143522 2 b</span> -<span class="go">1 6 -1.530894 1 b</span> -<span class="go">2 7 0.936422 2 b</span> -<span class="go">3 8 -0.086070 1 b</span> -<span class="go">4 9 -2.060658 2 b</span> +<span class="go">0 5 1.324098 2 b</span> +<span class="go">1 6 -0.055986 1 b</span> +<span class="go">2 7 -0.831589 2 b</span> +<span class="go">3 8 -0.263288 1 b</span> +<span class="go">4 9 -0.214678 2 b</span> </pre></div> </div> <section id="different-partitioning-schemes"> @@ -1980,19 +1980,19 @@ is materialized as columns when reading the data and can be used for filtering:< <div class="highlight-ipython notranslate"><div class="highlight"><pre><span></span><span class="gp">In [47]: </span><span class="n">dataset</span><span class="o">.</span><span class="n">to_table</span><span class="p">()</span><span class="o">.</span><span class="n">to_pandas</span><span class="p">()</span> <span class="gh">Out[47]: </span> <span class="go"> year col1 col2</span> -<span class="go">0 2018 0 1.515929</span> -<span class="go">1 2018 1 -1.262641</span> -<span class="go">2 2018 2 -0.521871</span> -<span class="go">3 2019 0 1.515929</span> -<span class="go">4 2019 1 -1.262641</span> -<span class="go">5 2019 2 -0.521871</span> +<span class="go">0 2018 0 -0.540980</span> +<span class="go">1 2018 1 0.898279</span> +<span class="go">2 2018 2 0.043922</span> +<span class="go">3 2019 0 -0.540980</span> +<span class="go">4 2019 1 0.898279</span> +<span class="go">5 2019 2 0.043922</span> <span class="gp">In [48]: </span><span class="n">dataset</span><span class="o">.</span><span class="n">to_table</span><span class="p">(</span><span class="nb">filter</span><span class="o">=</span><span class="n">ds</span><span class="o">.</span><span class="n">field</span><span class="p">(</span><span class="s1">'year'</span><span class="p">)</span> <span class="o">==</span> <span class="mi">2019</span><span 
class="p">)</span><span class="o">.</span><span class="n">to_pandas</spa [...] <span class="gh">Out[48]: </span> <span class="go"> year col1 col2</span> -<span class="go">0 2019 0 1.515929</span> -<span class="go">1 2019 1 -1.262641</span> -<span class="go">2 2019 2 -0.521871</span> +<span class="go">0 2019 0 -0.540980</span> +<span class="go">1 2019 1 0.898279</span> +<span class="go">2 2019 2 0.043922</span> </pre></div> </div> <p>Another benefit of manually listing the files is that the order of the files @@ -2244,7 +2244,7 @@ to supply a visitor that will be called as each file is created:</p> <span class="gp"> ....: </span> <span class="go">path=dataset_visited/c=1/part-0.parquet</span> <span class="go">size=824 bytes</span> -<span class="go">metadata=<pyarrow._parquet.FileMetaData object at 0x7fb7d9a90400></span> +<span class="go">metadata=<pyarrow._parquet.FileMetaData object at 0x7f30fd6a4e50></span> <span class="go"> created_by: parquet-cpp-arrow version 22.0.0-SNAPSHOT</span> <span class="go"> num_columns: 2</span> <span class="go"> num_rows: 5</span> @@ -2253,7 +2253,7 @@ to supply a visitor that will be called as each file is created:</p> <span class="go"> serialized_size: 0</span> <span class="go">path=dataset_visited/c=2/part-0.parquet</span> <span class="go">size=826 bytes</span> -<span class="go">metadata=<pyarrow._parquet.FileMetaData object at 0x7fb7debef880></span> +<span class="go">metadata=<pyarrow._parquet.FileMetaData object at 0x7f30fd7f8400></span> <span class="go"> created_by: parquet-cpp-arrow version 22.0.0-SNAPSHOT</span> <span class="go"> num_columns: 2</span> <span class="go"> num_rows: 5</span> diff --git a/docs/dev/python/getstarted.html b/docs/dev/python/getstarted.html index 1507140f362..0190c9b5694 100644 --- a/docs/dev/python/getstarted.html +++ b/docs/dev/python/getstarted.html @@ -1595,7 +1595,7 @@ it’s possible to apply transformations to the data</p> <span class="gp">In [12]: </span><span class="n">pc</span><span 
class="o">.</span><span class="n">value_counts</span><span class="p">(</span><span class="n">birthdays_table</span><span class="p">[</span><span class="s2">"years"</span><span class="p">])</span> <span class="gh">Out[12]: </span> -<span class="go"><pyarrow.lib.StructArray object at 0x7fb7ae7a9360></span> +<span class="go"><pyarrow.lib.StructArray object at 0x7f30cd2dd6c0></span> <span class="go">-- is_valid: all not null</span> <span class="go">-- child 0 type: int16</span> <span class="go"> [</span> diff --git a/docs/dev/python/memory.html b/docs/dev/python/memory.html index bf2d21d5976..3c05ccae224 100644 --- a/docs/dev/python/memory.html +++ b/docs/dev/python/memory.html @@ -1544,7 +1544,7 @@ a bytes object:</p> <span class="gp">In [3]: </span><span class="n">buf</span> <span class="o">=</span> <span class="n">pa</span><span class="o">.</span><span class="n">py_buffer</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> <span class="gp">In [4]: </span><span class="n">buf</span> -<span class="gh">Out[4]: </span><span class="go"><pyarrow.Buffer address=0x7fb7ad08aed0 size=26 is_cpu=True is_mutable=False></span> +<span class="gh">Out[4]: </span><span class="go"><pyarrow.Buffer address=0x7f30cbb30790 size=26 is_cpu=True is_mutable=False></span> <span class="gp">In [5]: </span><span class="n">buf</span><span class="o">.</span><span class="n">size</span> <span class="gh">Out[5]: </span><span class="go">26</span> @@ -1557,7 +1557,7 @@ referenced using the <a class="reference internal" href="generated/pyarrow.forei <p>Buffers can be used in circumstances where a Python buffer or memoryview is required, and such conversions are zero-copy:</p> <div class="highlight-ipython notranslate"><div class="highlight"><pre><span></span><span class="gp">In [6]: </span><span class="nb">memoryview</span><span class="p">(</span><span class="n">buf</span><span class="p">)</span> -<span class="gh">Out[6]: </span><span class="go"><memory at 
0x7fb7ae6f2800></span> +<span class="gh">Out[6]: </span><span class="go"><memory at 0x7f30cc02e800></span> </pre></div> </div> <p>The Buffer’s <a class="reference internal" href="generated/pyarrow.Buffer.html#pyarrow.Buffer.to_pybytes" title="pyarrow.Buffer.to_pybytes"><code class="xref py py-meth docutils literal notranslate"><span class="pre">to_pybytes()</span></code></a> method converts the Buffer’s data to a @@ -1756,7 +1756,7 @@ into Arrow Buffer objects, use <code class="docutils literal notranslate"><span <span class="gp">In [32]: </span><span class="n">buf</span> <span class="o">=</span> <span class="n">mmap</span><span class="o">.</span><span class="n">read_buffer</span><span class="p">(</span><span class="mi">4</span><span class="p">)</span> <span class="gp">In [33]: </span><span class="nb">print</span><span class="p">(</span><span class="n">buf</span><span class="p">)</span> -<span class="go"><pyarrow.Buffer address=0x7fb8499d8000 size=4 is_cpu=True is_mutable=False></span> +<span class="go"><pyarrow.Buffer address=0x7f3168661000 size=4 is_cpu=True is_mutable=False></span> <span class="gp">In [34]: </span><span class="n">buf</span><span class="o">.</span><span class="n">to_pybytes</span><span class="p">()</span> <span class="gh">Out[34]: </span><span class="go">b'some'</span> diff --git a/docs/dev/python/pandas.html b/docs/dev/python/pandas.html index b72ad4da33d..139ea3ac8e7 100644 --- a/docs/dev/python/pandas.html +++ b/docs/dev/python/pandas.html @@ -1718,7 +1718,7 @@ same categories of the Pandas DataFrame.</p> <span class="gp">In [10]: </span><span class="n">chunk</span><span class="o">.</span><span class="n">dictionary</span> <span class="gh">Out[10]: </span> -<span class="go"><pyarrow.lib.StringArray object at 0x7fb7acff8940></span> +<span class="go"><pyarrow.lib.StringArray object at 0x7f30fd67a740></span> <span class="go">[</span> <span class="go"> "a",</span> <span class="go"> "b",</span> @@ -1727,7 +1727,7 @@ same categories of the Pandas 
DataFrame.</p> <span class="gp">In [11]: </span><span class="n">chunk</span><span class="o">.</span><span class="n">indices</span> <span class="gh">Out[11]: </span> -<span class="go"><pyarrow.lib.Int8Array object at 0x7fb7acff8880></span> +<span class="go"><pyarrow.lib.Int8Array object at 0x7f30cd2dc340></span> <span class="go">[</span> <span class="go"> 0,</span> <span class="go"> 1,</span> @@ -1853,7 +1853,7 @@ converted to an Arrow <code class="docutils literal notranslate"><span class="pr <span class="gp">In [33]: </span><span class="n">arr</span> <span class="gh">Out[33]: </span> -<span class="go"><pyarrow.lib.Time64Array object at 0x7fb7acffa500></span> +<span class="go"><pyarrow.lib.Time64Array object at 0x7f314bb22860></span> <span class="go">[</span> <span class="go"> 01:01:01.000000,</span> <span class="go"> 02:02:02.000000</span> diff --git a/docs/dev/python/parquet.html b/docs/dev/python/parquet.html index 1bdf4b3c748..63033136b34 100644 --- a/docs/dev/python/parquet.html +++ b/docs/dev/python/parquet.html @@ -1689,7 +1689,7 @@ you may choose to omit it by passing <code class="docutils literal notranslate"> <span class="gp">In [20]: </span><span class="n">parquet_file</span><span class="o">.</span><span class="n">metadata</span> <span class="gh">Out[20]: </span> -<span class="go"><pyarrow._parquet.FileMetaData object at 0x7fb8393b1120></span> +<span class="go"><pyarrow._parquet.FileMetaData object at 0x7f314bbc6c00></span> <span class="go"> created_by: parquet-cpp-arrow version 22.0.0-SNAPSHOT</span> <span class="go"> num_columns: 4</span> <span class="go"> num_rows: 3</span> @@ -1699,7 +1699,7 @@ you may choose to omit it by passing <code class="docutils literal notranslate"> <span class="gp">In [21]: </span><span class="n">parquet_file</span><span class="o">.</span><span class="n">schema</span> <span class="gh">Out[21]: </span> -<span class="go"><pyarrow._parquet.ParquetSchema object at 0x7fb8393bbfc0></span> +<span 
class="go"><pyarrow._parquet.ParquetSchema object at 0x7f30c9476e80></span> <span class="go">required group field_id=-1 schema {</span> <span class="go"> optional double field_id=-1 one;</span> <span class="go"> optional binary field_id=-1 two (String);</span> @@ -1757,7 +1757,7 @@ concatenate them into a single table. You can read individual row groups with <span class="gp">In [30]: </span><span class="n">metadata</span> <span class="gh">Out[30]: </span> -<span class="go"><pyarrow._parquet.FileMetaData object at 0x7fb7aea80d10></span> +<span class="go"><pyarrow._parquet.FileMetaData object at 0x7f30cd860d10></span> <span class="go"> created_by: parquet-cpp-arrow version 22.0.0-SNAPSHOT</span> <span class="go"> num_columns: 4</span> <span class="go"> num_rows: 3</span> @@ -1771,7 +1771,7 @@ concatenate them into a single table. You can read individual row groups with such as the row groups and column chunk metadata and statistics:</p> <div class="highlight-ipython notranslate"><div class="highlight"><pre><span></span><span class="gp">In [31]: </span><span class="n">metadata</span><span class="o">.</span><span class="n">row_group</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="gh">Out[31]: </span> -<span class="go"><pyarrow._parquet.RowGroupMetaData object at 0x7fb8393b2b10></span> +<span class="go"><pyarrow._parquet.RowGroupMetaData object at 0x7f30c93babb0></span> <span class="go"> num_columns: 4</span> <span class="go"> num_rows: 3</span> <span class="go"> total_byte_size: 290</span> @@ -1779,7 +1779,7 @@ such as the row groups and column chunk metadata and statistics:</p> <span class="gp">In [32]: </span><span class="n">metadata</span><span class="o">.</span><span class="n">row_group</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span><span class="o">.</span><span class="n">column</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="gh">Out[32]: 
</span> -<span class="go"><pyarrow._parquet.ColumnChunkMetaData object at 0x7fb8393b2d40></span> +<span class="go"><pyarrow._parquet.ColumnChunkMetaData object at 0x7f314bbf5a80></span> <span class="go"> file_offset: 0</span> <span class="go"> file_path: </span> <span class="go"> physical_type: DOUBLE</span> @@ -1787,7 +1787,7 @@ such as the row groups and column chunk metadata and statistics:</p> <span class="go"> path_in_schema: one</span> <span class="go"> is_stats_set: True</span> <span class="go"> statistics:</span> -<span class="go"> <pyarrow._parquet.Statistics object at 0x7fb8393b2d90></span> +<span class="go"> <pyarrow._parquet.Statistics object at 0x7f314bbf5b20></span> <span class="go"> has_min_max: True</span> <span class="go"> min: -1.0</span> <span class="go"> max: 2.5</span> diff --git a/docs/dev/r/articles/data_wrangling.html b/docs/dev/r/articles/data_wrangling.html index ea0dad1e948..3e8cc89dd45 100644 --- a/docs/dev/r/articles/data_wrangling.html +++ b/docs/dev/r/articles/data_wrangling.html @@ -409,18 +409,18 @@ paying a performance penalty using the helper function <span> <span class="co"># perform other arrow operations...</span></span> <span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/compute.html" class="external-link">collect</a></span><span class="op">(</span><span class="op">)</span></span></code></pre></div> <pre><code><span><span class="co">## <span style="color: #949494;"># A tibble: 28 x 4</span></span></span> -<span><span class="co">## name height mass hair_color</span></span> -<span><span class="co">## <span style="color: #949494; font-style: italic;"><chr></span> <span style="color: #949494; font-style: italic;"><int></span> <span style="color: #949494; font-style: italic;"><dbl></span> <span style="color: #949494; font-style: italic;"><chr></span> </span></span> -<span><span class="co">## <span style="color: #BCBCBC;"> 1</span> <span style="color: #949494;">"</span>Yoda<span style="color: #949494;">"</span> 66 
17 white </span></span> -<span><span class="co">## <span style="color: #BCBCBC;"> 2</span> <span style="color: #949494;">"</span>Leia Organa<span style="color: #949494;">"</span> 150 49 brown </span></span> -<span><span class="co">## <span style="color: #BCBCBC;"> 3</span> <span style="color: #949494;">"</span>Beru Whitesun Lars<span style="color: #949494;">"</span> 165 75 brown </span></span> -<span><span class="co">## <span style="color: #BCBCBC;"> 4</span> <span style="color: #949494;">"</span>Wedge Antilles<span style="color: #949494;">"</span> 170 77 brown </span></span> -<span><span class="co">## <span style="color: #BCBCBC;"> 5</span> <span style="color: #949494;">"</span>Wicket Systri Warrick<span style="color: #949494;">"</span> 88 20 brown </span></span> -<span><span class="co">## <span style="color: #BCBCBC;"> 6</span> <span style="color: #949494;">"</span>Cord\u00e9<span style="color: #949494;">"</span> 157 <span style="color: #BB0000;">NA</span> brown </span></span> -<span><span class="co">## <span style="color: #BCBCBC;"> 7</span> <span style="color: #949494;">"</span>Dorm\u00e9<span style="color: #949494;">"</span> 165 <span style="color: #BB0000;">NA</span> brown </span></span> -<span><span class="co">## <span style="color: #BCBCBC;"> 8</span> <span style="color: #949494;">"</span>R4-P17<span style="color: #949494;">"</span> 96 <span style="color: #BB0000;">NA</span> none </span></span> -<span><span class="co">## <span style="color: #BCBCBC;"> 9</span> <span style="color: #949494;">"</span>Lobot<span style="color: #949494;">"</span> 175 79 none </span></span> -<span><span class="co">## <span style="color: #BCBCBC;">10</span> <span style="color: #949494;">"</span>Ackbar<span style="color: #949494;">"</span> 180 83 none </span></span> +<span><span class="co">## name height mass hair_color</span></span> +<span><span class="co">## <span style="color: #949494; font-style: italic;"><chr></span> <span style="color: #949494; font-style: 
italic;"><int></span> <span style="color: #949494; font-style: italic;"><dbl></span> <span style="color: #949494; font-style: italic;"><chr></span> </span></span> +<span><span class="co">## <span style="color: #BCBCBC;"> 1</span> R4-P17 96 <span style="color: #BB0000;">NA</span> none </span></span> +<span><span class="co">## <span style="color: #BCBCBC;"> 2</span> Lobot 175 79 none </span></span> +<span><span class="co">## <span style="color: #BCBCBC;"> 3</span> Ackbar 180 83 none </span></span> +<span><span class="co">## <span style="color: #BCBCBC;"> 4</span> Nien Nunb 160 68 none </span></span> +<span><span class="co">## <span style="color: #BCBCBC;"> 5</span> Sebulba 112 40 none </span></span> +<span><span class="co">## <span style="color: #BCBCBC;"> 6</span> Bib Fortuna 180 <span style="color: #BB0000;">NA</span> none </span></span> +<span><span class="co">## <span style="color: #BCBCBC;"> 7</span> Ayla Secura 178 55 none </span></span> +<span><span class="co">## <span style="color: #BCBCBC;"> 8</span> Ratts Tyerel 79 15 none </span></span> +<span><span class="co">## <span style="color: #BCBCBC;"> 9</span> Dud Bolt 94 45 none </span></span> +<span><span class="co">## <span style="color: #BCBCBC;">10</span> Gasgano 122 <span style="color: #BB0000;">NA</span> none </span></span> <span><span class="co">## <span style="color: #949494;"># i 18 more rows</span></span></span></code></pre> </div> <div class="section level2"> diff --git a/docs/dev/r/news/index.html b/docs/dev/r/news/index.html index cd23871b853..646b19683d3 100644 --- a/docs/dev/r/news/index.html +++ b/docs/dev/r/news/index.html @@ -76,14 +76,14 @@ <h2 class="pkg-version" data-toc-text="21.0.0.9000" id="arrow-21009000">arrow 21.0.0.9000<a class="anchor" aria-label="anchor" href="#arrow-21009000"></a></h2> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="21.0.0.1" id="arrow-21001">arrow 21.0.0.1<a class="anchor" aria-label="anchor" href="#arrow-21001"></a></h2><p 
class="text-muted">CRAN release: 2025-08-18</p> +<h2 class="pkg-version" data-toc-text="21.0.0.1" id="arrow-21001">arrow 21.0.0.1<a class="anchor" aria-label="anchor" href="#arrow-21001"></a></h2> <div class="section level3"> <h3 id="minor-improvements-and-fixes-21-0-0-1">Minor improvements and fixes<a class="anchor" aria-label="anchor" href="#minor-improvements-and-fixes-21-0-0-1"></a></h3> <ul><li>Patch bundled version of Thrift to prevent CRAN check failures (<a href="https://github.com/kou" class="external-link">@kou</a>, <a href="https://github.com/apache/arrow/issues/47286" class="external-link">#47286</a>)</li> </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="21.0.0" id="arrow-2100">arrow 21.0.0<a class="anchor" aria-label="anchor" href="#arrow-2100"></a></h2><p class="text-muted">CRAN release: 2025-07-24</p> +<h2 class="pkg-version" data-toc-text="21.0.0" id="arrow-2100">arrow 21.0.0<a class="anchor" aria-label="anchor" href="#arrow-2100"></a></h2> <div class="section level3"> <h3 id="new-features-21-0-0">New features<a class="anchor" aria-label="anchor" href="#new-features-21-0-0"></a></h3> <ul><li>Support for Arrow’s 32 and 64 bit Decimal types (<a href="https://github.com/apache/arrow/issues/46720" class="external-link">#46720</a>).</li> @@ -103,21 +103,21 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="20.0.0.2" id="arrow-20002">arrow 20.0.0.2<a class="anchor" aria-label="anchor" href="#arrow-20002"></a></h2><p class="text-muted">CRAN release: 2025-05-26</p> +<h2 class="pkg-version" data-toc-text="20.0.0.2" id="arrow-20002">arrow 20.0.0.2<a class="anchor" aria-label="anchor" href="#arrow-20002"></a></h2> <div class="section level3"> <h3 id="minor-improvements-and-fixes-20-0-0-2">Minor improvements and fixes<a class="anchor" aria-label="anchor" href="#minor-improvements-and-fixes-20-0-0-2"></a></h3> <ul><li>Updated internal C++ code to comply with CRAN’s gcc-UBSAN checks 
(<a href="https://github.com/apache/arrow/issues/46394" class="external-link">#46394</a>)</li> </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="20.0.0" id="arrow-2000">arrow 20.0.0<a class="anchor" aria-label="anchor" href="#arrow-2000"></a></h2><p class="text-muted">CRAN release: 2025-05-10</p> +<h2 class="pkg-version" data-toc-text="20.0.0" id="arrow-2000">arrow 20.0.0<a class="anchor" aria-label="anchor" href="#arrow-2000"></a></h2> <div class="section level3"> <h3 id="minor-improvements-and-fixes-20-0-0">Minor improvements and fixes<a class="anchor" aria-label="anchor" href="#minor-improvements-and-fixes-20-0-0"></a></h3> <ul><li>Binary Arrays now inherit from <code><a href="https://blob.tidyverse.org/reference/blob.html" class="external-link">blob::blob</a></code> in addition to <code>arrow_binary</code> when <a href="https://arrow.apache.org/docs/r/articles/data_types.html#translations-from-arrow-to-r">converted to R objects</a>. This change is the first step in eventually deprecating the <code>arrow_binary</code> class in favor of the <code>blob</code> class in the <a href="https://cran.r-project.org/p [...] 
</ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="19.0.1.1" id="arrow-19011">arrow 19.0.1.1<a class="anchor" aria-label="anchor" href="#arrow-19011"></a></h2><p class="text-muted">CRAN release: 2025-04-08</p> +<h2 class="pkg-version" data-toc-text="19.0.1.1" id="arrow-19011">arrow 19.0.1.1<a class="anchor" aria-label="anchor" href="#arrow-19011"></a></h2> <div class="section level3"> <h3 id="minor-improvements-and-fixes-19-0-1-1">Minor improvements and fixes<a class="anchor" aria-label="anchor" href="#minor-improvements-and-fixes-19-0-1-1"></a></h3> <ul><li>Updated internal code to comply with new CRAN requirements on non-API calls (<a href="https://github.com/apache/arrow/issues/45949" class="external-link">#45949</a>)</li> @@ -125,11 +125,11 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="19.0.1" id="arrow-1901">arrow 19.0.1<a class="anchor" aria-label="anchor" href="#arrow-1901"></a></h2><p class="text-muted">CRAN release: 2025-02-26</p> +<h2 class="pkg-version" data-toc-text="19.0.1" id="arrow-1901">arrow 19.0.1<a class="anchor" aria-label="anchor" href="#arrow-1901"></a></h2> <p>This release primarily updates the underlying Arrow C++ version used by the package to version 19.0.1 and includes all changes from the 19.0.0 and 19.0.1 releases. For what’s changed in Arrow C++ 19.0.0, please see the <a href="https://arrow.apache.org/blog/2025/01/16/19.0.0-release/" class="external-link">blog post</a> and <a href="https://arrow.apache.org/release/19.0.0.html#changelog" class="external-link">changelog</a>. For what’s changed in Arrow C++ 19.0.1, please see the <a hre [...] 
</div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="18.1.0" id="arrow-1810">arrow 18.1.0<a class="anchor" aria-label="anchor" href="#arrow-1810"></a></h2><p class="text-muted">CRAN release: 2024-12-05</p> +<h2 class="pkg-version" data-toc-text="18.1.0" id="arrow-1810">arrow 18.1.0<a class="anchor" aria-label="anchor" href="#arrow-1810"></a></h2> <div class="section level3"> <h3 id="minor-improvements-and-fixes-18-1-0">Minor improvements and fixes<a class="anchor" aria-label="anchor" href="#minor-improvements-and-fixes-18-1-0"></a></h3> <ul><li>Fix bindings to allow filtering a factor column in a Dataset using <code>%in%</code> (<a href="https://github.com/apache/arrow/issues/43446" class="external-link">#43446</a>)</li> @@ -141,7 +141,7 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="17.0.0" id="arrow-1700">arrow 17.0.0<a class="anchor" aria-label="anchor" href="#arrow-1700"></a></h2><p class="text-muted">CRAN release: 2024-08-17</p> +<h2 class="pkg-version" data-toc-text="17.0.0" id="arrow-1700">arrow 17.0.0<a class="anchor" aria-label="anchor" href="#arrow-1700"></a></h2> <div class="section level3"> <h3 id="new-features-17-0-0">New features<a class="anchor" aria-label="anchor" href="#new-features-17-0-0"></a></h3> <ul><li>R functions that users write that use functions that Arrow supports in dataset queries now can be used in queries too. Previously, only functions that used arithmetic operators worked. For example, <code>time_hours <- function(mins) mins / 60</code> worked, but <code>time_hours_rounded <- function(mins) round(mins / 60)</code> did not; now both work. These are automatic translations rather than true user-defined functions (UDFs); for UDFs, see <code><a href="../reference/re [...] 
@@ -161,7 +161,7 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="16.1.0" id="arrow-1610">arrow 16.1.0<a class="anchor" aria-label="anchor" href="#arrow-1610"></a></h2><p class="text-muted">CRAN release: 2024-05-25</p> +<h2 class="pkg-version" data-toc-text="16.1.0" id="arrow-1610">arrow 16.1.0<a class="anchor" aria-label="anchor" href="#arrow-1610"></a></h2> <div class="section level3"> <h3 id="new-features-16-1-0">New features<a class="anchor" aria-label="anchor" href="#new-features-16-1-0"></a></h3> <ul><li>Streams can now be written to socket connections (<a href="https://github.com/apache/arrow/issues/38897" class="external-link">#38897</a>)</li> @@ -176,7 +176,7 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="15.0.1" id="arrow-1501">arrow 15.0.1<a class="anchor" aria-label="anchor" href="#arrow-1501"></a></h2><p class="text-muted">CRAN release: 2024-03-12</p> +<h2 class="pkg-version" data-toc-text="15.0.1" id="arrow-1501">arrow 15.0.1<a class="anchor" aria-label="anchor" href="#arrow-1501"></a></h2> <div class="section level3"> <h3 id="new-features-15-0-1">New features<a class="anchor" aria-label="anchor" href="#new-features-15-0-1"></a></h3> <ul><li>Bindings for <code><a href="https://rdrr.io/r/base/prod.html" class="external-link">base::prod</a></code> have been added so you can now use it in your dplyr pipelines (i.e., <code>tbl |> summarize(prod(col))</code>) without having to pull the data into R (<a href="https://github.com/m-muecke" class="external-link">@m-muecke</a>, <a href="https://github.com/apache/arrow/issues/38601" class="external-link">#38601</a>).</li> @@ -196,7 +196,7 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="14.0.2.1" id="arrow-14021">arrow 14.0.2.1<a class="anchor" aria-label="anchor" href="#arrow-14021"></a></h2><p class="text-muted">CRAN release: 2024-02-23</p> +<h2 class="pkg-version" 
data-toc-text="14.0.2.1" id="arrow-14021">arrow 14.0.2.1<a class="anchor" aria-label="anchor" href="#arrow-14021"></a></h2> <div class="section level3"> <h3 id="minor-improvements-and-fixes-14-0-2-1">Minor improvements and fixes<a class="anchor" aria-label="anchor" href="#minor-improvements-and-fixes-14-0-2-1"></a></h3> <ul><li>Check for internet access when building from source and fallback to a minimally scoped Arrow C++ build (<a href="https://github.com/apache/arrow/issues/39699" class="external-link">#39699</a>).</li> @@ -215,7 +215,7 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="14.0.0.2" id="arrow-14002">arrow 14.0.0.2<a class="anchor" aria-label="anchor" href="#arrow-14002"></a></h2><p class="text-muted">CRAN release: 2023-12-02</p> +<h2 class="pkg-version" data-toc-text="14.0.0.2" id="arrow-14002">arrow 14.0.0.2<a class="anchor" aria-label="anchor" href="#arrow-14002"></a></h2> <div class="section level3"> <h3 id="minor-improvements-and-fixes-14-0-0-2">Minor improvements and fixes<a class="anchor" aria-label="anchor" href="#minor-improvements-and-fixes-14-0-0-2"></a></h3> <ul><li>Fixed the printf syntax to align with format checking (<a href="https://github.com/apache/arrow/issues/38894" class="external-link">#38894</a>)</li> @@ -232,7 +232,7 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="14.0.0.1" id="arrow-14001">arrow 14.0.0.1<a class="anchor" aria-label="anchor" href="#arrow-14001"></a></h2><p class="text-muted">CRAN release: 2023-11-24</p> +<h2 class="pkg-version" data-toc-text="14.0.0.1" id="arrow-14001">arrow 14.0.0.1<a class="anchor" aria-label="anchor" href="#arrow-14001"></a></h2> <div class="section level3"> <h3 id="minor-improvements-and-fixes-14-0-0-1">Minor improvements and fixes<a class="anchor" aria-label="anchor" href="#minor-improvements-and-fixes-14-0-0-1"></a></h3> <ul><li>Add more debug output for build failures (<a 
href="https://github.com/apache/arrow/issues/38819" class="external-link">#38819</a>)</li> @@ -241,7 +241,7 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="14.0.0" id="arrow-1400">arrow 14.0.0<a class="anchor" aria-label="anchor" href="#arrow-1400"></a></h2><p class="text-muted">CRAN release: 2023-11-16</p> +<h2 class="pkg-version" data-toc-text="14.0.0" id="arrow-1400">arrow 14.0.0<a class="anchor" aria-label="anchor" href="#arrow-1400"></a></h2> <div class="section level3"> <h3 id="new-features-14-0-0">New features<a class="anchor" aria-label="anchor" href="#new-features-14-0-0"></a></h3> <ul><li>When reading partitioned CSV datasets and supplying a schema to <code><a href="../reference/open_dataset.html">open_dataset()</a></code>, the partition variables are now included in the resulting dataset (<a href="https://github.com/apache/arrow/issues/37658" class="external-link">#37658</a>).</li> @@ -272,11 +272,11 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="13.0.0.1" id="arrow-13001">arrow 13.0.0.1<a class="anchor" aria-label="anchor" href="#arrow-13001"></a></h2><p class="text-muted">CRAN release: 2023-09-22</p> +<h2 class="pkg-version" data-toc-text="13.0.0.1" id="arrow-13001">arrow 13.0.0.1<a class="anchor" aria-label="anchor" href="#arrow-13001"></a></h2> <ul><li>Remove reference to legacy timezones to prevent CRAN check failures (<a href="https://github.com/apache/arrow/issues/37671" class="external-link">#37671</a>)</li> </ul></div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="13.0.0" id="arrow-1300">arrow 13.0.0<a class="anchor" aria-label="anchor" href="#arrow-1300"></a></h2><p class="text-muted">CRAN release: 2023-08-30</p> +<h2 class="pkg-version" data-toc-text="13.0.0" id="arrow-1300">arrow 13.0.0<a class="anchor" aria-label="anchor" href="#arrow-1300"></a></h2> <div class="section level3"> <h3 id="breaking-changes-13-0-0">Breaking changes<a 
class="anchor" aria-label="anchor" href="#breaking-changes-13-0-0"></a></h3> <ul><li>Input objects which inherit only from <code>data.frame</code> and no other classes now have the <code>class</code> attribute dropped, resulting in now always returning tibbles from file reading functions and <code><a href="../reference/table.html">arrow_table()</a></code>, which results in consistency in the type of returned objects. Calling <code><a href="https://rdrr.io/r/base/as.data.frame.html" class="external-link">as.data.frame()</a></code> on Arrow Tabular objects now alwa [...] @@ -316,16 +316,16 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="12.0.1.1" id="arrow-12011">arrow 12.0.1.1<a class="anchor" aria-label="anchor" href="#arrow-12011"></a></h2><p class="text-muted">CRAN release: 2023-07-18</p> +<h2 class="pkg-version" data-toc-text="12.0.1.1" id="arrow-12011">arrow 12.0.1.1<a class="anchor" aria-label="anchor" href="#arrow-12011"></a></h2> <ul><li>Update a package version reference to be text only instead of numeric due to CRAN update requiring this (<a href="https://github.com/apache/arrow/issues/36353" class="external-link">#36353</a>, <a href="https://github.com/apache/arrow/issues/36364" class="external-link">#36364</a>)</li> </ul></div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="12.0.1" id="arrow-1201">arrow 12.0.1<a class="anchor" aria-label="anchor" href="#arrow-1201"></a></h2><p class="text-muted">CRAN release: 2023-06-15</p> +<h2 class="pkg-version" data-toc-text="12.0.1" id="arrow-1201">arrow 12.0.1<a class="anchor" aria-label="anchor" href="#arrow-1201"></a></h2> <ul><li>Update the version of the date library vendored with Arrow C++ library for compatibility with tzdb 0.4.0 (<a href="https://github.com/apache/arrow/issues/35594" class="external-link">#35594</a>, <a href="https://github.com/apache/arrow/issues/35612" class="external-link">#35612</a>).</li> <li>Update some tests for 
compatibility with waldo 0.5.1 (<a href="https://github.com/apache/arrow/issues/35131" class="external-link">#35131</a>, <a href="https://github.com/apache/arrow/issues/35308" class="external-link">#35308</a>).</li> </ul></div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="12.0.0" id="arrow-1200">arrow 12.0.0<a class="anchor" aria-label="anchor" href="#arrow-1200"></a></h2><p class="text-muted">CRAN release: 2023-05-05</p> +<h2 class="pkg-version" data-toc-text="12.0.0" id="arrow-1200">arrow 12.0.0<a class="anchor" aria-label="anchor" href="#arrow-1200"></a></h2> <div class="section level3"> <h3 id="new-features-12-0-0">New features<a class="anchor" aria-label="anchor" href="#new-features-12-0-0"></a></h3> <ul><li>The <code><a href="../reference/read_parquet.html">read_parquet()</a></code> and <code><a href="../reference/read_feather.html">read_feather()</a></code> functions can now accept URL arguments (<a href="https://github.com/apache/arrow/issues/33287" class="external-link">#33287</a>, <a href="https://github.com/apache/arrow/issues/34708" class="external-link">#34708</a>).</li> @@ -357,7 +357,7 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="11.0.0.3" id="arrow-11003">arrow 11.0.0.3<a class="anchor" aria-label="anchor" href="#arrow-11003"></a></h2><p class="text-muted">CRAN release: 2023-03-08</p> +<h2 class="pkg-version" data-toc-text="11.0.0.3" id="arrow-11003">arrow 11.0.0.3<a class="anchor" aria-label="anchor" href="#arrow-11003"></a></h2> <div class="section level3"> <h3 id="minor-improvements-and-fixes-11-0-0-3">Minor improvements and fixes<a class="anchor" aria-label="anchor" href="#minor-improvements-and-fixes-11-0-0-3"></a></h3> <ul><li> @@ -366,7 +366,7 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="11.0.0.2" id="arrow-11002">arrow 11.0.0.2<a class="anchor" aria-label="anchor" href="#arrow-11002"></a></h2><p class="text-muted">CRAN release: 
2023-02-12</p> +<h2 class="pkg-version" data-toc-text="11.0.0.2" id="arrow-11002">arrow 11.0.0.2<a class="anchor" aria-label="anchor" href="#arrow-11002"></a></h2> <div class="section level3"> <h3 id="breaking-changes-11-0-0-2">Breaking changes<a class="anchor" aria-label="anchor" href="#breaking-changes-11-0-0-2"></a></h3> <ul><li> @@ -432,14 +432,14 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="10.0.1" id="arrow-1001">arrow 10.0.1<a class="anchor" aria-label="anchor" href="#arrow-1001"></a></h2><p class="text-muted">CRAN release: 2022-12-06</p> +<h2 class="pkg-version" data-toc-text="10.0.1" id="arrow-1001">arrow 10.0.1<a class="anchor" aria-label="anchor" href="#arrow-1001"></a></h2> <p>Minor improvements and fixes:</p> <ul><li>Fixes for failing test after lubridate 1.9 release (<a href="https://github.com/apache/arrow/issues/14615" class="external-link">#14615</a>)</li> <li>Update to ensure compatibility with changes in dev purrr (<a href="https://github.com/apache/arrow/issues/14581" class="external-link">#14581</a>)</li> <li>Fix to correctly handle <code>.data</code> pronoun in <code><a href="https://dplyr.tidyverse.org/reference/group_by.html" class="external-link">dplyr::group_by()</a></code> (<a href="https://github.com/apache/arrow/issues/14484" class="external-link">#14484</a>)</li> </ul></div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="10.0.0" id="arrow-1000">arrow 10.0.0<a class="anchor" aria-label="anchor" href="#arrow-1000"></a></h2><p class="text-muted">CRAN release: 2022-10-26</p> +<h2 class="pkg-version" data-toc-text="10.0.0" id="arrow-1000">arrow 10.0.0<a class="anchor" aria-label="anchor" href="#arrow-1000"></a></h2> <div class="section level3"> <h3 id="arrow-dplyr-queries-10-0-0">Arrow dplyr queries<a class="anchor" aria-label="anchor" href="#arrow-dplyr-queries-10-0-0"></a></h3> <p>Several new functions can be used in queries:</p> @@ -475,7 +475,7 @@ </ul></div> </div> 
<div class="section level2"> -<h2 class="pkg-version" data-toc-text="9.0.0" id="arrow-900">arrow 9.0.0<a class="anchor" aria-label="anchor" href="#arrow-900"></a></h2><p class="text-muted">CRAN release: 2022-08-10</p> +<h2 class="pkg-version" data-toc-text="9.0.0" id="arrow-900">arrow 9.0.0<a class="anchor" aria-label="anchor" href="#arrow-900"></a></h2> <div class="section level3"> <h3 id="arrow-dplyr-queries-9-0-0">Arrow dplyr queries<a class="anchor" aria-label="anchor" href="#arrow-dplyr-queries-9-0-0"></a></h3> <ul><li>New dplyr verbs: @@ -543,7 +543,7 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="8.0.0" id="arrow-800">arrow 8.0.0<a class="anchor" aria-label="anchor" href="#arrow-800"></a></h2><p class="text-muted">CRAN release: 2022-05-09</p> +<h2 class="pkg-version" data-toc-text="8.0.0" id="arrow-800">arrow 8.0.0<a class="anchor" aria-label="anchor" href="#arrow-800"></a></h2> <div class="section level3"> <h3 id="enhancements-to-dplyr-and-datasets-8-0-0">Enhancements to dplyr and datasets<a class="anchor" aria-label="anchor" href="#enhancements-to-dplyr-and-datasets-8-0-0"></a></h3> <ul><li> @@ -655,7 +655,7 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="7.0.0" id="arrow-700">arrow 7.0.0<a class="anchor" aria-label="anchor" href="#arrow-700"></a></h2><p class="text-muted">CRAN release: 2022-02-10</p> +<h2 class="pkg-version" data-toc-text="7.0.0" id="arrow-700">arrow 7.0.0<a class="anchor" aria-label="anchor" href="#arrow-700"></a></h2> <div class="section level3"> <h3 id="enhancements-to-dplyr-and-datasets-7-0-0">Enhancements to dplyr and datasets<a class="anchor" aria-label="anchor" href="#enhancements-to-dplyr-and-datasets-7-0-0"></a></h3> <ul><li>Additional <a href="https://lubridate.tidyverse.org" class="external-link">lubridate</a> features: <code>week()</code>, more of the <code>is.*()</code> functions, and the label argument to <code>month()</code> have been 
implemented.</li> @@ -710,7 +710,7 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="6.0.1" id="arrow-601">arrow 6.0.1<a class="anchor" aria-label="anchor" href="#arrow-601"></a></h2><p class="text-muted">CRAN release: 2021-11-20</p> +<h2 class="pkg-version" data-toc-text="6.0.1" id="arrow-601">arrow 6.0.1<a class="anchor" aria-label="anchor" href="#arrow-601"></a></h2> <ul><li>Joins now support inclusion of dictionary columns, and multiple crashes have been fixed</li> <li>Grouped aggregation no longer crashes when working on data that has been filtered down to 0 rows</li> <li>Bindings added for <code>str_count()</code> in dplyr queries</li> @@ -784,11 +784,11 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="5.0.0.2" id="arrow-5002">arrow 5.0.0.2<a class="anchor" aria-label="anchor" href="#arrow-5002"></a></h2><p class="text-muted">CRAN release: 2021-09-05</p> +<h2 class="pkg-version" data-toc-text="5.0.0.2" id="arrow-5002">arrow 5.0.0.2<a class="anchor" aria-label="anchor" href="#arrow-5002"></a></h2> <p>This patch version contains fixes for some sanitizer and compiler warnings.</p> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="5.0.0" id="arrow-500">arrow 5.0.0<a class="anchor" aria-label="anchor" href="#arrow-500"></a></h2><p class="text-muted">CRAN release: 2021-07-29</p> +<h2 class="pkg-version" data-toc-text="5.0.0" id="arrow-500">arrow 5.0.0<a class="anchor" aria-label="anchor" href="#arrow-500"></a></h2> <div class="section level3"> <h3 id="more-dplyr-5-0-0">More dplyr<a class="anchor" aria-label="anchor" href="#more-dplyr-5-0-0"></a></h3> <ul><li> @@ -838,17 +838,17 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="4.0.1" id="arrow-401">arrow 4.0.1<a class="anchor" aria-label="anchor" href="#arrow-401"></a></h2><p class="text-muted">CRAN release: 2021-05-28</p> +<h2 class="pkg-version" 
data-toc-text="4.0.1" id="arrow-401">arrow 4.0.1<a class="anchor" aria-label="anchor" href="#arrow-401"></a></h2> <ul><li>Resolved a few bugs in new string compute kernels (<a href="https://github.com/apache/arrow/issues/10320" class="external-link">#10320</a>, <a href="https://github.com/apache/arrow/issues/10287" class="external-link">#10287</a>)</li> </ul></div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="4.0.0.1" id="arrow-4001">arrow 4.0.0.1<a class="anchor" aria-label="anchor" href="#arrow-4001"></a></h2><p class="text-muted">CRAN release: 2021-05-10</p> +<h2 class="pkg-version" data-toc-text="4.0.0.1" id="arrow-4001">arrow 4.0.0.1<a class="anchor" aria-label="anchor" href="#arrow-4001"></a></h2> <ul><li>The mimalloc memory allocator is the default memory allocator when using a static source build of the package on Linux. This is because it has better behavior under valgrind than jemalloc does. A full-featured build (installed with <code>LIBARROW_MINIMAL=false</code>) includes both jemalloc and mimalloc, and it still has jemalloc as default, though this is configurable at runtime with the <code>ARROW_DEFAULT_MEMORY_POOL</code> environment variable.</li> <li>Environment variables <code>LIBARROW_MINIMAL</code>, <code>LIBARROW_DOWNLOAD</code>, and <code>NOT_CRAN</code> are now case-insensitive in the Linux build script.</li> <li>A build configuration issue in the macOS binary package has been resolved.</li> </ul></div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="4.0.0" id="arrow-400">arrow 4.0.0<a class="anchor" aria-label="anchor" href="#arrow-400"></a></h2><p class="text-muted">CRAN release: 2021-04-27</p> +<h2 class="pkg-version" data-toc-text="4.0.0" id="arrow-400">arrow 4.0.0<a class="anchor" aria-label="anchor" href="#arrow-400"></a></h2> <div class="section level3"> <h3 id="dplyr-methods-4-0-0">dplyr methods<a class="anchor" aria-label="anchor" href="#dplyr-methods-4-0-0"></a></h3> <p>Many more
<code>dplyr</code> verbs are supported on Arrow objects:</p> @@ -916,7 +916,7 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="3.0.0" id="arrow-300">arrow 3.0.0<a class="anchor" aria-label="anchor" href="#arrow-300"></a></h2><p class="text-muted">CRAN release: 2021-01-27</p> +<h2 class="pkg-version" data-toc-text="3.0.0" id="arrow-300">arrow 3.0.0<a class="anchor" aria-label="anchor" href="#arrow-300"></a></h2> <div class="section level3"> <h3 id="python-and-flight-3-0-0">Python and Flight<a class="anchor" aria-label="anchor" href="#python-and-flight-3-0-0"></a></h3> <ul><li>Flight methods <code><a href="../reference/flight_get.html">flight_get()</a></code> and <code><a href="../reference/flight_put.html">flight_put()</a></code> (renamed from <code>push_data()</code> in this release) can handle both Tables and RecordBatches</li> @@ -968,7 +968,7 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="2.0.0" id="arrow-200">arrow 2.0.0<a class="anchor" aria-label="anchor" href="#arrow-200"></a></h2><p class="text-muted">CRAN release: 2020-10-20</p> +<h2 class="pkg-version" data-toc-text="2.0.0" id="arrow-200">arrow 2.0.0<a class="anchor" aria-label="anchor" href="#arrow-200"></a></h2> <div class="section level3"> <h3 id="datasets-2-0-0">Datasets<a class="anchor" aria-label="anchor" href="#datasets-2-0-0"></a></h3> <ul><li> @@ -1019,7 +1019,7 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="1.0.1" id="arrow-101">arrow 1.0.1<a class="anchor" aria-label="anchor" href="#arrow-101"></a></h2><p class="text-muted">CRAN release: 2020-08-28</p> +<h2 class="pkg-version" data-toc-text="1.0.1" id="arrow-101">arrow 1.0.1<a class="anchor" aria-label="anchor" href="#arrow-101"></a></h2> <div class="section level3"> <h3 id="bug-fixes-1-0-1">Bug fixes<a class="anchor" aria-label="anchor" href="#bug-fixes-1-0-1"></a></h3> <ul><li>Filtering a Dataset that has multiple 
partition keys using an <code>%in%</code> expression now faithfully returns all relevant rows</li> @@ -1032,7 +1032,7 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="1.0.0" id="arrow-100">arrow 1.0.0<a class="anchor" aria-label="anchor" href="#arrow-100"></a></h2><p class="text-muted">CRAN release: 2020-07-25</p> +<h2 class="pkg-version" data-toc-text="1.0.0" id="arrow-100">arrow 1.0.0<a class="anchor" aria-label="anchor" href="#arrow-100"></a></h2> <div class="section level3"> <h3 id="arrow-format-conversion-1-0-0">Arrow format conversion<a class="anchor" aria-label="anchor" href="#arrow-format-conversion-1-0-0"></a></h3> <ul><li> @@ -1085,14 +1085,14 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="0.17.1" id="arrow-0171">arrow 0.17.1<a class="anchor" aria-label="anchor" href="#arrow-0171"></a></h2><p class="text-muted">CRAN release: 2020-05-19</p> +<h2 class="pkg-version" data-toc-text="0.17.1" id="arrow-0171">arrow 0.17.1<a class="anchor" aria-label="anchor" href="#arrow-0171"></a></h2> <ul><li>Updates for compatibility with <code>dplyr</code> 1.0</li> <li> <code><a href="https://rstudio.github.io/reticulate/reference/r-py-conversion.html" class="external-link">reticulate::r_to_py()</a></code> conversion now correctly works automatically, without having to call the method yourself</li> <li>Assorted bug fixes in the C++ library around Parquet reading</li> </ul></div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="0.17.0" id="arrow-0170">arrow 0.17.0<a class="anchor" aria-label="anchor" href="#arrow-0170"></a></h2><p class="text-muted">CRAN release: 2020-04-21</p> +<h2 class="pkg-version" data-toc-text="0.17.0" id="arrow-0170">arrow 0.17.0<a class="anchor" aria-label="anchor" href="#arrow-0170"></a></h2> <div class="section level3"> <h3 id="feather-v2-0-17-0">Feather v2<a class="anchor" aria-label="anchor" href="#feather-v2-0-17-0"></a></h3> <p>This release 
includes support for version 2 of the Feather file format. Feather v2 features full support for all Arrow data types, fixes the 2GB per-column limitation for large amounts of string data, and it allows files to be compressed using either <code>lz4</code> or <code>zstd</code>. <code><a href="../reference/write_feather.html">write_feather()</a></code> can write either version 2 or <a href="https://github.com/wesm/feather" class="external-link">version 1</a> Feather files, a [...] @@ -1134,7 +1134,7 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="0.16.0.2" id="arrow-01602">arrow 0.16.0.2<a class="anchor" aria-label="anchor" href="#arrow-01602"></a></h2><p class="text-muted">CRAN release: 2020-02-14</p> +<h2 class="pkg-version" data-toc-text="0.16.0.2" id="arrow-01602">arrow 0.16.0.2<a class="anchor" aria-label="anchor" href="#arrow-01602"></a></h2> <ul><li> <code><a href="../reference/install_arrow.html">install_arrow()</a></code> now installs the latest release of <code>arrow</code>, including Linux dependencies, either for CRAN releases or for development builds (if <code>nightly = TRUE</code>)</li> <li>Package installation on Linux no longer downloads C++ dependencies unless the <code>LIBARROW_DOWNLOAD</code> or <code>NOT_CRAN</code> environment variable is set</li> @@ -1143,7 +1143,7 @@ <li>Can now infer the type of an R <code>list</code> and create a ListArray when all list elements are the same type (<a href="https://github.com/apache/arrow/issues/6275" class="external-link">#6275</a>, <a href="https://github.com/michaelchirico" class="external-link">@michaelchirico</a>)</li> </ul></div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="0.16.0" id="arrow-0160">arrow 0.16.0<a class="anchor" aria-label="anchor" href="#arrow-0160"></a></h2><p class="text-muted">CRAN release: 2020-02-09</p> +<h2 class="pkg-version" data-toc-text="0.16.0" id="arrow-0160">arrow 0.16.0<a class="anchor" aria-label="anchor" 
href="#arrow-0160"></a></h2> <div class="section level3"> <h3 id="multi-file-datasets-0-16-0">Multi-file datasets<a class="anchor" aria-label="anchor" href="#multi-file-datasets-0-16-0"></a></h3> <p>This release includes a <code>dplyr</code> interface to Arrow Datasets, which let you work efficiently with large, multi-file datasets as a single entity. Explore a directory of data files with <code><a href="../reference/open_dataset.html">open_dataset()</a></code> and then use <code>dplyr</code> methods to <code><a href="https://dplyr.tidyverse.org/reference/select.html" class="external-link">select()</a></code>, <code><a href="https://dplyr.tidyverse.org/reference/filter.html" clas [...] @@ -1178,11 +1178,11 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="0.15.1" id="arrow-0151">arrow 0.15.1<a class="anchor" aria-label="anchor" href="#arrow-0151"></a></h2><p class="text-muted">CRAN release: 2019-11-04</p> +<h2 class="pkg-version" data-toc-text="0.15.1" id="arrow-0151">arrow 0.15.1<a class="anchor" aria-label="anchor" href="#arrow-0151"></a></h2> <ul><li>This patch release includes bugfixes in the C++ library around dictionary types and Parquet reading.</li> </ul></div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="0.15.0" id="arrow-0150">arrow 0.15.0<a class="anchor" aria-label="anchor" href="#arrow-0150"></a></h2><p class="text-muted">CRAN release: 2019-10-07</p> +<h2 class="pkg-version" data-toc-text="0.15.0" id="arrow-0150">arrow 0.15.0<a class="anchor" aria-label="anchor" href="#arrow-0150"></a></h2> <div class="section level3"> <h3 id="breaking-changes-0-15-0">Breaking changes<a class="anchor" aria-label="anchor" href="#breaking-changes-0-15-0"></a></h3> <ul><li>The R6 classes that wrap the C++ classes are now documented and exported and have been renamed to be more R-friendly. Users of the high-level R interface in this package are not affected. 
Those who want to interact with the Arrow C++ API more directly should work with these objects and methods. As part of this change, many functions that instantiated these R6 objects have been removed in favor of <code>Class$create()</code> methods. Notably, <code><a href="https://rdrr.io/r/base/ [...] @@ -1210,7 +1210,7 @@ </ul></div> </div> <div class="section level2"> -<h2 class="pkg-version" data-toc-text="0.14.1" id="arrow-0141">arrow 0.14.1<a class="anchor" aria-label="anchor" href="#arrow-0141"></a></h2><p class="text-muted">CRAN release: 2019-08-05</p> +<h2 class="pkg-version" data-toc-text="0.14.1" id="arrow-0141">arrow 0.14.1<a class="anchor" aria-label="anchor" href="#arrow-0141"></a></h2> <p>Initial CRAN release of the <code>arrow</code> package. Key features include:</p> <ul><li>Read and write support for various file formats, including Parquet, Feather/Arrow, CSV, and JSON.</li> <li>API bindings to the C++ library for Arrow data types and objects, as well as mapping between Arrow types and R data types.</li> diff --git a/docs/dev/r/pkgdown.yml b/docs/dev/r/pkgdown.yml index 74a2c55cc58..b6acac536e4 100644 --- a/docs/dev/r/pkgdown.yml +++ b/docs/dev/r/pkgdown.yml @@ -21,7 +21,7 @@ articles: read_write: read_write.html developers/setup: developers/setup.html developers/workflow: developers/workflow.html -last_built: 2025-09-06T01:22Z +last_built: 2025-09-07T01:25Z urls: reference: https://arrow.apache.org/docs/r/reference article: https://arrow.apache.org/docs/r/articles diff --git a/docs/dev/r/reference/to_arrow.html b/docs/dev/r/reference/to_arrow.html index 8a162de7f42..1336915d943 100644 --- a/docs/dev/r/reference/to_arrow.html +++ b/docs/dev/r/reference/to_arrow.html @@ -121,9 +121,9 @@ result to materialize the entire Table in-memory.</p> <span class="r-out co"><span class="r-pr">#></span> <span style="color: #949494;"># A tibble: 3 x 2</span></span> <span class="r-out co"><span class="r-pr">#></span> cyl mean_mpg</span> <span class="r-out 
co"><span class="r-pr">#></span> <span style="color: #949494; font-style: italic;"><dbl></span> <span style="color: #949494; font-style: italic;"><dbl></span></span> -<span class="r-out co"><span class="r-pr">#></span> <span style="color: #BCBCBC;">1</span> 4 23.7</span> -<span class="r-out co"><span class="r-pr">#></span> <span style="color: #BCBCBC;">2</span> 6 19.7</span> -<span class="r-out co"><span class="r-pr">#></span> <span style="color: #BCBCBC;">3</span> 8 15.1</span> +<span class="r-out co"><span class="r-pr">#></span> <span style="color: #BCBCBC;">1</span> 6 19.7</span> +<span class="r-out co"><span class="r-pr">#></span> <span style="color: #BCBCBC;">2</span> 8 15.1</span> +<span class="r-out co"><span class="r-pr">#></span> <span style="color: #BCBCBC;">3</span> 4 23.7</span> </code></pre></div> </div> </main><aside class="col-md-3"><nav id="toc" aria-label="Table of contents"><h2>On this page</h2> diff --git a/docs/dev/r/reference/to_duckdb.html b/docs/dev/r/reference/to_duckdb.html index 95f2423e2b7..5a4814efb3d 100644 --- a/docs/dev/r/reference/to_duckdb.html +++ b/docs/dev/r/reference/to_duckdb.html @@ -145,11 +145,11 @@ using them.</p> <span class="r-out co"><span class="r-pr">#></span> <span style="color: #949494;"># Groups: cyl</span></span> <span class="r-out co"><span class="r-pr">#></span> mpg cyl disp hp drat wt qsec vs am gear carb</span> <span class="r-out co"><span class="r-pr">#></span> <span style="color: #949494; font-style: italic;"><dbl></span> <span style="color: #949494; font-style: italic;"><dbl></span> <span style="color: #949494; font-style: italic;"><dbl></span> <span style="color: #949494; font-style: italic;"><dbl></span> <span style="color: #949494; font-style: italic;"><dbl></span> <span style="color: #949494; font-style: italic;"><dbl></span> <span style="color: # [...] 
-<span class="r-out co"><span class="r-pr">#></span> <span style="color: #BCBCBC;">1</span> 19.7 6 145 175 3.62 2.77 15.5 0 1 5 6</span> -<span class="r-out co"><span class="r-pr">#></span> <span style="color: #BCBCBC;">2</span> 27.3 4 79 66 4.08 1.94 18.9 1 1 4 1</span> -<span class="r-out co"><span class="r-pr">#></span> <span style="color: #BCBCBC;">3</span> 16.4 8 276. 180 3.07 4.07 17.4 0 0 3 3</span> -<span class="r-out co"><span class="r-pr">#></span> <span style="color: #BCBCBC;">4</span> 17.3 8 276. 180 3.07 3.73 17.6 0 0 3 3</span> -<span class="r-out co"><span class="r-pr">#></span> <span style="color: #BCBCBC;">5</span> 15.2 8 276. 180 3.07 3.78 18 0 0 3 3</span> +<span class="r-out co"><span class="r-pr">#></span> <span style="color: #BCBCBC;">1</span> 16.4 8 276. 180 3.07 4.07 17.4 0 0 3 3</span> +<span class="r-out co"><span class="r-pr">#></span> <span style="color: #BCBCBC;">2</span> 17.3 8 276. 180 3.07 3.73 17.6 0 0 3 3</span> +<span class="r-out co"><span class="r-pr">#></span> <span style="color: #BCBCBC;">3</span> 15.2 8 276. 
180 3.07 3.78 18 0 0 3 3</span> +<span class="r-out co"><span class="r-pr">#></span> <span style="color: #BCBCBC;">4</span> 27.3 4 79 66 4.08 1.94 18.9 1 1 4 1</span> +<span class="r-out co"><span class="r-pr">#></span> <span style="color: #BCBCBC;">5</span> 19.7 6 145 175 3.62 2.77 15.5 0 1 5 6</span> </code></pre></div> </div> </main><aside class="col-md-3"><nav id="toc" aria-label="Table of contents"><h2>On this page</h2> diff --git a/docs/dev/r/search.json b/docs/dev/r/search.json index 6ba82d1ab5f..67e564b3984 100644 --- a/docs/dev/r/search.json +++ b/docs/dev/r/search.json @@ -1 +1 @@ -[{"path":"https://arrow.apache.org/docs/r/PACKAGING.html","id":null,"dir":"","previous_headings":"","what":"Packaging Checklist for CRAN Release","title":"Packaging Checklist for CRAN Release","text":"high-level overview Arrow release process see Apache Arrow Release Management Guide.","code":""},{"path":"https://arrow.apache.org/docs/r/PACKAGING.html","id":"before-the-arrow-release-candidate-is-created","dir":"","previous_headings":"","what":"Before the Arrow Release Candidate Is Create [...] +[{"path":"https://arrow.apache.org/docs/r/PACKAGING.html","id":null,"dir":"","previous_headings":"","what":"Packaging Checklist for CRAN Release","title":"Packaging Checklist for CRAN Release","text":"high-level overview Arrow release process see Apache Arrow Release Management Guide.","code":""},{"path":"https://arrow.apache.org/docs/r/PACKAGING.html","id":"before-the-arrow-release-candidate-is-created","dir":"","previous_headings":"","what":"Before the Arrow Release Candidate Is Create [...] 
diff --git a/docs/dev/searchindex.js b/docs/dev/searchindex.js index e3271600830..98ca9efb1b1 100644 --- a/docs/dev/searchindex.js +++ b/docs/dev/searchindex.js @@ -1 +1 @@ -Search.setIndex({"alltitles":{"1st pass":[[83,"st-pass"]],"2nd pass":[[83,"nd-pass"]],"32-bit hash vs 64-bit hash":[[83,"bit-hash-vs-64-bit-hash"]],"8-bit Boolean":[[126,"bit-boolean"]],"A Database":[[9,"a-database"]],"A Library for Data Scientists":[[9,"a-library-for-data-scientists"]],"A Note on Linking":[[38,"a-note-on-linking"]],"A note on transactions & ACID guarantees":[[42,"a-note-on-transactions-acid-guarantees"],[184,"a-note-on-transactions-acid-guarantees"]],"ABI Structures":[[ [...] \ No newline at end of file +Search.setIndex({"alltitles":{"1st pass":[[83,"st-pass"]],"2nd pass":[[83,"nd-pass"]],"32-bit hash vs 64-bit hash":[[83,"bit-hash-vs-64-bit-hash"]],"8-bit Boolean":[[126,"bit-boolean"]],"A Database":[[9,"a-database"]],"A Library for Data Scientists":[[9,"a-library-for-data-scientists"]],"A Note on Linking":[[38,"a-note-on-linking"]],"A note on transactions & ACID guarantees":[[42,"a-note-on-transactions-acid-guarantees"],[184,"a-note-on-transactions-acid-guarantees"]],"ABI Structures":[[ [...] \ No newline at end of file