[
https://issues.apache.org/jira/browse/ARROW-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated ARROW-14318:
-----------------------------------
Labels: pull-request-available (was: )
> [Doc][Python] Error building dataset docs
> -----------------------------------------
>
> Key: ARROW-14318
> URL: https://issues.apache.org/jira/browse/ARROW-14318
> Project: Apache Arrow
> Issue Type: Bug
> Components: Documentation, Python
> Reporter: Antoine Pitrou
> Assignee: Joris Van den Bossche
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> I get this error locally, even after removing what seems like leftovers from
> previous doc builds:
> {code}
> >>>-------------------------------------------------------------------------
> Exception in /home/antoine/arrow/dev/docs/source/python/dataset.rst at block ending on line 522
> Specify :okexcept: as an option in the ipython:: block to suppress this message
> ---------------------------------------------------------------------------
> ArrowInvalid Traceback (most recent call last)
> <ipython-input-58-affbef2c47b2> in <module>
> ----> 1 ds.write_dataset(table, dataset_root, format="parquet")
> ~/arrow/dev/python/pyarrow/dataset.py in write_dataset(data, base_dir, basename_template, format, partitioning, partitioning_flavor, schema, filesystem, file_options, use_threads, max_partitions, file_visitor)
> 859 scanner = data
> 860
> --> 861 _filesystemdataset_write(
> 862 scanner, base_dir, basename_template, filesystem, partitioning,
> 863 file_options, max_partitions, file_visitor
> ~/arrow/dev/python/pyarrow/_dataset.pyx in pyarrow._dataset._filesystemdataset_write()
> ~/arrow/dev/python/pyarrow/error.pxi in pyarrow.lib.check_status()
> ArrowInvalid: Could not write to /tmp/sample_dataset as the directory is not empty and existing_data_behavior is to error
> /home/antoine/arrow/dev/cpp/src/arrow/dataset/dataset_writer.cc:508 EnsureDestinationValid(write_options)
> /home/antoine/arrow/dev/cpp/src/arrow/dataset/file_base.cc:424 internal::DatasetWriter::Make(write_options)
> /home/antoine/arrow/dev/cpp/src/arrow/compute/exec/exec_plan.cc:433 MakeExecNode(this->factory_name, plan, std::move(inputs), *this->options, registry)
> /home/antoine/arrow/dev/cpp/src/arrow/dataset/file_base.cc:395 compute::Declaration::Sequence( { {"scan", ScanNodeOptions{dataset, scanner->options()}}, {"filter", compute::FilterNodeOptions{scanner->options()->filter}}, {"project", compute::ProjectNodeOptions{std::move(exprs), std::move(names)}}, {"write", WriteNodeOptions{write_options, scanner->options()->projected_schema}}, }) .AddToPlan(plan.get())
> <<<-------------------------------------------------------------------------
> {code}
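The traceback shows the write being rejected because /tmp/sample_dataset already holds files from an earlier run. One possible workaround, sketched here with the standard library only, is to empty the target directory before the write. The helper name `reset_directory` and the temporary paths are hypothetical and are not taken from the Arrow docs; this is not necessarily how the issue was resolved. Later pyarrow releases may also accept an `existing_data_behavior` argument on `ds.write_dataset`, but the signature in the traceback does not show one, so the sketch does not rely on it.

```python
import shutil
import tempfile
from pathlib import Path


def reset_directory(path):
    """Delete `path` if it exists, then recreate it as an empty directory.

    Hypothetical helper for illustration; not part of pyarrow.
    """
    target = Path(path)
    if target.exists():
        shutil.rmtree(target)
    target.mkdir(parents=True)
    return target


# Throwaway directory standing in for /tmp/sample_dataset from the report.
base = Path(tempfile.mkdtemp())
dataset_root = base / "sample_dataset"
dataset_root.mkdir()
# Simulate a file left behind by a previous doc build.
(dataset_root / "part-0.parquet").write_bytes(b"stale")

dataset_root = reset_directory(dataset_root)
leftovers = list(dataset_root.iterdir())

# With the directory empty again, a call such as
#   ds.write_dataset(table, dataset_root, format="parquet")
# should no longer trip the "directory is not empty" check.
```

Because `reset_directory` deletes everything under the given path, it is only suitable for scratch locations such as the sample dataset directory used by the docs.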