[
https://issues.apache.org/jira/browse/ARROW-10276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17213484#comment-17213484
]
utsav edited comment on ARROW-10276 at 10/13/20, 11:34 PM:
-----------------------------------------------------------
An update: I tried running the code as a script. The output was:
20/10/13 23:29:04 WARN NativeCodeLoader: Unable to load native-hadoop library
for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use
setLogLevel(newLevel).
20/10/13 23:29:13 WARN Utils: Service 'SparkUI' could not bind on port 4040.
Attempting port 4041.
20/10/13 23:29:31 WARN SizeEstimator: Failed to check whether
UseCompressedOops is set; assuming yes
+----------+---+
|       _c0|_c1|
+----------+---+
|1582999200|  1|
|1582999260|  1|
|1582999320|  1|
|1582999380|  1|
|1582999440|  1|
|1582999500|  1|
|1582999560|  1|
|1582999620|  1|
|1582999680|  1|
|1582999740|  1|
|1582999800|  1|
|1582999860|  1|
|1582999920|  1|
|1582999980|  1|
|1583000040|  1|
|1583000100|  1|
|1583000160|  1|
|1583000220|  1|
|1583000280|  1|
|1583000340|  1|
+----------+---+
only showing top 20 rows
/opt/spark/python/pyspark/sql/dataframe.py:2110: UserWarning: toPandas
attempted Arrow optimization because 'spark.sql.execution.arrow.enabled' is set
to true; however, failed by the reason below:
PyArrow >= 0.8.0 must be installed; however, it was not found.
Attempting non-optimization as 'spark.sql.execution.arrow.fallback.enabled' is
set to true.
warnings.warn(msg)
I then ran:
`pip3 show pyarrow`
Name: pyarrow
Version: 0.17.0
Summary: Python library for Apache Arrow
Home-page: https://arrow.apache.org/
Author: Apache Arrow Developers
Author-email: [email protected]
License: Apache License, Version 2.0
Location: /home/xilinx/.local/lib/python3.6/site-packages
Requires: numpy
Required-by:
It definitely exists on my PYTHONPATH, since I added the following line to my
.bashrc and sourced it:
`export PYTHONPATH=/home/xilinx/.local/lib/python3.6/site-packages:$PYTHONPATH`
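A "not found" warning despite `pip3 show` listing the package can mean that the interpreter Spark launches is not the one `pip3` installed into. The following is a minimal diagnostic sketch, not part of the original report: the helper names `describe_module` and `environment_report` are made up for illustration, and it assumes that `PYSPARK_PYTHON` (the environment variable Spark uses to select the Python executable) is the relevant setting here. It uses only the standard library and reports which interpreter is running and whether that interpreter can locate `pyarrow`.

```python
import importlib.util
import os
import sys


def describe_module(name):
    """Report whether the current interpreter can locate the module `name`."""
    spec = importlib.util.find_spec(name)  # None when the module is not importable
    return {
        "module": name,
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),  # file the module would load from
    }


def environment_report(module_name="pyarrow"):
    """Collect interpreter and environment details relevant to the lookup."""
    report = describe_module(module_name)
    report["executable"] = sys.executable
    report["python_version"] = "%d.%d.%d" % sys.version_info[:3]
    # Unset variables are reported as None.
    report["PYSPARK_PYTHON"] = os.environ.get("PYSPARK_PYTHON")
    report["PYTHONPATH"] = os.environ.get("PYTHONPATH")
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print("%s: %s" % (key, value))
```

Running this once with plain `python3` and once through `spark-submit` would show whether both use the same executable and whether each of them finds `pyarrow` under the `site-packages` directory listed above.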
> Armv7 orc and flight not supported for build. Compat error on using with spark
> ------------------------------------------------------------------------------
>
> Key: ARROW-10276
> URL: https://issues.apache.org/jira/browse/ARROW-10276
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 0.17.0
> Reporter: utsav
> Priority: Major
> Attachments: arrow_compat_error, build_pip_wheel.sh,
> dpu_stream_spark.ipynb, get_arrow_and_create_venv.sh, run_build.sh
>
>
> I'm using an Arm Cortex-A9 processor on the Xilinx PYNQ-Z2 board. In previous
> posts, people have tried to use Arrow on the Raspberry Pi 3 without luck.
> I figured out how to build it successfully for armv7 using the script below,
> but I cannot use the ORC and Flight flags. People looked into this in
> ARROW-8420, but I don't know whether they faced these issues.
> I tried converting a Spark DataFrame to pandas using pyarrow, but now it
> complains about a compat attribute. I have attached images below.
> Any help would be appreciated. Thanks.
> Spark Version: 2.4.5.
> The code is as follows:
> ```
> import pandas as pd
> df_pd = df.toPandas()
> npArr = df_pd.to_numpy()
> ```
> The error is as follows:-
> ```
> /opt/spark/python/pyspark/sql/dataframe.py:2110: UserWarning: toPandas
> attempted Arrow optimization because 'spark.sql.execution.arrow.enabled' is
> set to true; however, failed by the reason below:
> module 'pyarrow' has no attribute 'compat'
> Attempting non-optimization as 'spark.sql.execution.arrow.fallback.enabled'
> is set to true.
> warnings.warn(msg)
> ```
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)