Todd Farmer created ARROW-17217:
-----------------------------------
Summary: [Docs][Python] Building documentation requires pandas
Key: ARROW-17217
URL: https://issues.apache.org/jira/browse/ARROW-17217
Project: Apache Arrow
Issue Type: Bug
Components: Documentation, Python
Reporter: Todd Farmer
The [build instructions for
documentation|https://arrow.apache.org/docs/developers/documentation.html]
direct users to install the packages listed in
[conda_env_sphinx.txt|https://github.com/apache/arrow/blob/master/ci/conda_env_sphinx.txt]
in order to build the documentation, but this file does not include pandas.
Building without pandas installed triggers the following error:
{code:java}
(test-nightlies) todd@pop-os:~/arrow/docs$ make html
sphinx-build -b html -d _build/doctrees -j8 source _build/html
Running Sphinx v5.1.0
[autosummary] generating autosummary for: c_glib/index.rst, cpp/api.rst,
cpp/api/array.rst, cpp/api/async.rst, cpp/api/builder.rst, cpp/api/c_abi.rst,
cpp/api/compute.rst, cpp/api/cuda.rst, cpp/api/dataset.rst,
cpp/api/datatype.rst, ..., python/json.rst, python/memory.rst,
python/numpy.rst, python/orc.rst, python/pandas.rst, python/parquet.rst,
python/plasma.rst, python/timestamps.rst, r/index.rst, status.rst
loading intersphinx inventory from https://docs.python.org/3/objects.inv...
loading intersphinx inventory from https://numpy.org/doc/stable/objects.inv...
loading intersphinx inventory from https://pandas.pydata.org/docs/objects.inv...
building [mo]: targets for 0 po files that are out of date
building [html]: targets for 842 source files that are out of date
updating environment: [new config] 842 added, 0 changed, 0 removed
reading sources... [ 7%] cpp/examples/compute_and_write_example .. developers/cpp/fuzzing
Sphinx parallel build error:
RuntimeError: Non Expected exception in
`/home/todd/arrow/docs/source/python/pandas.rst` line 38
make: *** [Makefile:81: html] Error 2
{code}
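One way to catch this before starting a build is to check whether the needed modules are importable in the active environment. The following is a minimal sketch, not part of the Arrow build; the helper name `missing_modules` is hypothetical, and pandas is the only module this report identifies as missing.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that the import system cannot find.

    Hypothetical helper for illustration; it only handles top-level module names.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: pandas is the only module known (from this report) to be
    # required by the docs build but absent from conda_env_sphinx.txt.
    missing = missing_modules(["pandas"])
    if missing:
        print("Not importable, install before 'make html':", ", ".join(missing))
    else:
        print("All checked modules are importable.")
```

Run inside the same conda environment that will execute `make html`, this reports a missing pandas up front instead of partway through the Sphinx run.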
Adding pandas to conda_env_sphinx.txt and reinstalling the packages from that
file results in a successful build:
{code:java}
(test-nightlies) todd@pop-os:~/arrow/docs$ conda install -c conda-forge --file ../ci/conda_env_sphinx.txt
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/todd/miniconda3/envs/test-nightlies

  added / updated specs:
    - breathe
    - doxygen
    - ipython
    - numpydoc
    - pandas
    - pydata-sphinx-theme==0.8
    - pytest-cython
    - sphinx-copybutton
    - sphinx-design
    - sphinx[version='>=4.2']

The following NEW packages will be INSTALLED:

  pandas             conda-forge/linux-64::pandas-1.4.3-py39h1832856_0
  python-dateutil    conda-forge/noarch::python-dateutil-2.8.2-pyhd8ed1ab_0

Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
(test-nightlies) todd@pop-os:~/arrow/docs$ make html
sphinx-build -b html -d _build/doctrees -j8 source _build/html
Running Sphinx v5.1.0
[autosummary] generating autosummary for: c_glib/index.rst, cpp/api.rst,
cpp/api/array.rst, cpp/api/async.rst, cpp/api/builder.rst, cpp/api/c_abi.rst,
cpp/api/compute.rst, cpp/api/cuda.rst, cpp/api/dataset.rst,
cpp/api/datatype.rst, ..., python/json.rst, python/memory.rst,
python/numpy.rst, python/orc.rst, python/pandas.rst, python/parquet.rst,
python/plasma.rst, python/timestamps.rst, r/index.rst, status.rst
loading intersphinx inventory from https://docs.python.org/3/objects.inv...
loading intersphinx inventory from https://numpy.org/doc/stable/objects.inv...
loading intersphinx inventory from https://pandas.pydata.org/docs/objects.inv...
building [mo]: targets for 0 po files that are out of date
building [html]: targets for 842 source files that are out of date
updating environment: [new config] 842 added, 0 changed, 0 removed
reading sources... [ 7%] cpp/examples/compute_and_write_example .. developers
reading sources... [ 18%] python/api/flight .. python/generated/pyarrow.Date32
reading sources... [ 22%] python/generated/pyarrow.Date32Scalar .. python/gene
reading sources... [ 25%] python/generated/pyarrow.HadoopFileSystem.get_space_
reading sources... [ 29%] python/generated/pyarrow.MonthDayNanoIntervalArray .
reading sources... [ 33%] python/generated/pyarrow.TimestampType .. python/gen
reading sources... [ 37%] python/generated/pyarrow.compute.MatchSubstringOptio
reading sources... [ 40%] python/generated/pyarrow.compute.all .. python/gener
reading sources... [ 44%] python/generated/pyarrow.compute.ascii_trim_whitespa
reading sources... [ 48%] python/generated/pyarrow.compute.day_of_week .. pyth
reading sources... [ 51%] python/generated/pyarrow.compute.is_null .. python/g
reading sources... [ 55%] python/generated/pyarrow.compute.milliseconds_betwee
reading sources... [ 59%] python/generated/pyarrow.compute.second .. python/ge
reading sources... [ 62%] python/generated/pyarrow.compute.us_week .. python/g
reading sources... [ 66%] python/generated/pyarrow.compute.variance .. python/
reading sources... [ 70%] python/generated/pyarrow.dataset.CsvFileFormat .. py
reading sources... [ 74%] python/generated/pyarrow.date64 .. python/generated/
reading sources... [ 77%] python/generated/pyarrow.flight.FlightServerError ..
reading sources... [ 81%] python/generated/pyarrow.fs.LocalFileSystem .. pytho
reading sources... [ 85%] python/generated/pyarrow.ipc.open_stream .. python/g
reading sources... [ 88%] python/generated/pyarrow.parquet.ParquetLogicalType
reading sources... [ 92%] python/generated/pyarrow.struct .. python/generated/
reading sources... [ 96%] python/generated/pyarrow.types.is_signed_integer ..
reading sources... [100%] python/json .. status
/home/todd/miniconda3/envs/test-nightlies/lib/python3.9/site-packages/pyarrow/parquet/__init__.py:docstring
of pyarrow.parquet.write_to_dataset:94: WARNING: Literal block ends without a
blank line; unexpected unindent.
WARNING: don't know which module to import for autodocumenting 'BufferReader'
(try placing a "module" or "currentmodule" directive in the document, or giving
an explicit module name)
WARNING: don't know which module to import for autodocumenting 'BufferWriter'
(try placing a "module" or "currentmodule" directive in the document, or giving
an explicit module name)
WARNING: don't know which module to import for autodocumenting 'Context' (try
placing a "module" or "currentmodule" directive in the document, or giving an
explicit module name)
WARNING: don't know which module to import for autodocumenting 'CudaBuffer'
(try placing a "module" or "currentmodule" directive in the document, or giving
an explicit module name)
WARNING: don't know which module to import for autodocumenting 'HostBuffer'
(try placing a "module" or "currentmodule" directive in the document, or giving
an explicit module name)
WARNING: don't know which module to import for autodocumenting 'IpcMemHandle'
(try placing a "module" or "currentmodule" directive in the document, or giving
an explicit module name)
WARNING: don't know which module to import for autodocumenting
'new_host_buffer' (try placing a "module" or "currentmodule" directive in the
document, or giving an explicit module name)
WARNING: don't know which module to import for autodocumenting 'read_message'
(try placing a "module" or "currentmodule" directive in the document, or giving
an explicit module name)
WARNING: don't know which module to import for autodocumenting
'read_record_batch' (try placing a "module" or "currentmodule" directive in the
document, or giving an explicit module name)
WARNING: don't know which module to import for autodocumenting
'serialize_record_batch' (try placing a "module" or "currentmodule" directive
in the document, or giving an explicit module name)
/home/todd/arrow/docs/source/cpp/api/dataset.rst:62: WARNING: Parsing of
expression failed. Using fallback parser. Error was:
Error in postfix expression, expected primary expression or type.
If primary expression:
Invalid C++ declaration: Expected identifier in nested name. [error at 59]
std::function< Status(FileWriter *)> writer_pre_finish = [](FileWriter*)
{returnStatus::OK();}
-----------------------------------------------------------^
If type:
Invalid C++ declaration: Expected identifier in nested name. [error at 59]
std::function< Status(FileWriter *)> writer_pre_finish = [](FileWriter*)
{returnStatus::OK();}
-----------------------------------------------------------^
/home/todd/arrow/docs/source/cpp/api/dataset.rst:62: WARNING: Parsing of
expression failed. Using fallback parser. Error was:
Error in postfix expression, expected primary expression or type.
If primary expression:
Invalid C++ declaration: Expected identifier in nested name. [error at 60]
std::function< Status(FileWriter *)> writer_post_finish = [](FileWriter*)
{returnStatus::OK();}
------------------------------------------------------------^
If type:
Invalid C++ declaration: Expected identifier in nested name. [error at 60]
std::function< Status(FileWriter *)> writer_post_finish = [](FileWriter*)
{returnStatus::OK();}
------------------------------------------------------------^
/home/todd/arrow/docs/source/cpp/api/dataset.rst:69: WARNING: Duplicate C++
declaration, also defined at cpp/api/dataset:69.
Declaration is '.. cpp:function:: virtual Result< std::shared_ptr< FileFragment
> > MakeFragment (FileSource source, compute::Expression partition_expression,
std::shared_ptr< Schema > physical_schema)'.
/home/todd/arrow/docs/source/cpp/api/flight.rst:159: WARNING: Duplicate C++
declaration, also defined at cpp/api/flight:159.
Declaration is '.. cpp:function:: virtual arrow::Result< FlightPayload >
GetSchemaPayload ()=0'.
/home/todd/arrow/docs/source/cpp/api/flightsql.rst:48: WARNING:
doxygenfunction: Unable to resolve function
"arrow::flight::sql::CreateStatementQueryTicket" with arguments "None".
Candidate function could not be parsed. Parsing error is
Error when parsing function declaration.
If the function has no return type:
Error in declarator or parameters-and-qualifiers
Invalid C++ declaration: Expecting "(" in parameters-and-qualifiers. [error
at 24]
ARROW_FLIGHT_SQL_EXPORT arrow::Result< std::string >
CreateStatementQueryTicket (const std::string &statement_handle)
------------------------^
If the function has a return type:
Error in declarator or parameters-and-qualifiers
If pointer to member declarator:
Invalid C++ declaration: Expected '::' in pointer to member (function).
[error at 53]
ARROW_FLIGHT_SQL_EXPORT arrow::Result< std::string >
CreateStatementQueryTicket (const std::string &statement_handle)
-----------------------------------------------------^
If declarator-id:
Invalid C++ declaration: Expecting "(" in parameters-and-qualifiers. [error
at 53]
ARROW_FLIGHT_SQL_EXPORT arrow::Result< std::string >
CreateStatementQueryTicket (const std::string &statement_handle)
-----------------------------------------------------^
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
writing output... [ 8%] cpp/examples/row_columnar_conversion .. developers/gu
writing output... [ 24%] python/generated/pyarrow.DurationScalar .. python/gen
writing output... [ 28%] python/generated/pyarrow.Int64Array .. python/generat
writing output... [ 32%] python/generated/pyarrow.SerializedPyObject .. python
writing output... [ 36%] python/generated/pyarrow.compress .. python/generated
writing output... [ 40%] python/generated/pyarrow.compute.StrptimeOptions .. p
writing output... [ 44%] python/generated/pyarrow.compute.ascii_lpad .. python
writing output... [ 48%] python/generated/pyarrow.compute.cos .. python/genera
writing output... [ 52%] python/generated/pyarrow.compute.indices_nonzero .. p
writing output... [ 56%] python/generated/pyarrow.compute.max_element_wise ..
writing output... [ 60%] python/generated/pyarrow.compute.round .. python/gene
writing output... [ 64%] python/generated/pyarrow.compute.unique .. python/gen
writing output... [ 68%] python/generated/pyarrow.compute.week .. python/gener
writing output... [ 72%] python/generated/pyarrow.dataset.DirectoryPartitionin
writing output... [ 76%] python/generated/pyarrow.deserialize_components .. py
writing output... [ 80%] python/generated/pyarrow.flight.FlightWriteSizeExceed
writing output... [ 84%] python/generated/pyarrow.get_include .. python/genera
writing output... [ 88%] python/generated/pyarrow.large_list .. python/generat
writing output... [ 92%] python/generated/pyarrow.parquet.read_table .. python
writing output... [ 96%] python/generated/pyarrow.types.is_float16 .. python/g
writing output... [100%] python/generated/pyarrow.uint64 .. status
generating indices... genindex done
highlighting module code... [100%] pyarrow.types
writing additional pages... search done
copying images... [ 47%] developers/images/python_tutorial_jira_description.jp
copying images... [ 57%] developers/images/python_tutorial_github_find_in_file
copying images... [ 60%] developers/images/python_tutorial_github_pr_notice.jp
copying images... [ 97%] format/FlightSql/CommandPreparedStatementQuery.mmd.sv
copying images... [100%] python/py_arch_overview.svg
copying downloadable files... [100%]
../../python/examples/parquet_encryption/sample_vault_kms_client.py
copying static files... done
copying extra files... done
dumping search index in English (code: en)... done
dumping object inventory... done
build succeeded, 16 warnings.

The HTML pages are in _build/html.

Build finished. The HTML pages are in _build/html.
(test-nightlies) todd@pop-os:~/arrow/docs$
{code}
Note that
[docs/requirements.txt|https://github.com/apache/arrow/blob/master/docs/requirements.txt]
does not include pandas either. I haven't tested the pip dependency path, but
I presume it is similarly affected and should be updated at the same time.
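Since two dependency files are involved, a small check that a given package is listed in each could guard against the same omission recurring. The sketch below is illustrative only: `listed_packages` is a hypothetical helper, and `SAMPLE` is an assumed stand-in for the file contents, modeled on the specs conda printed in the session above rather than copied from either file.

```python
import re

# A package name runs up to the first character that cannot appear in it,
# such as a version operator (==, >=), a bracket, or whitespace.
_NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def listed_packages(text):
    """Return the lower-cased package names listed in requirements-style text.

    Hypothetical helper; it ignores comments, blank lines, and option lines
    (those starting with '-').
    """
    names = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        match = _NAME_RE.match(line)
        if match:
            names.add(match.group(0).lower())
    return names


# Assumed contents for illustration, not the real file.
SAMPLE = """\
breathe
doxygen
ipython
numpydoc
pydata-sphinx-theme==0.8
pytest-cython
sphinx-copybutton
sphinx-design
sphinx>=4.2
"""

if __name__ == "__main__":
    for required in ("pandas",):
        status = "listed" if required in listed_packages(SAMPLE) else "MISSING"
        print(f"{required}: {status}")
```

To apply it to the real files, read ci/conda_env_sphinx.txt and docs/requirements.txt from a checkout and pass their text to `listed_packages`.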