[
https://issues.apache.org/jira/browse/ARROW-16386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529091#comment-17529091
]
Larry Dawson commented on ARROW-16386:
--------------------------------------
This works fine on debian:latest but breaks on ubuntu:latest. It turns out that
the Ubuntu image is missing the timezone setup, and adding the following line to
the Dockerfile fixes the problem:
{code:java}
RUN DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt-get -y install tzdata
{code}
I've fixed this in my Dockerfile, but I'm not sure whether this counts as a bug
in pyarrow. The code I'm running is simple and has no functional dependency on
the local timezone, so I don't think it should fail when the timezone hasn't
been set up. On the other hand, I have to admit that timezone setup is pretty
basic functionality.
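
For anyone who cannot change the image, a preflight check in Python is one
possible stopgap. This is only a sketch: the helper names localtime_available
and require_localtime are made up for illustration, and the path is simply the
one named in the error message. It does not change what the ORC reader does; it
only turns the missing file into an ordinary Python exception before the reader
is called.

```python
import os

# Path taken from the error message in this issue.
LOCALTIME_PATH = "/etc/localtime"


def localtime_available(path=LOCALTIME_PATH):
    """Return True if `path` resolves to a readable regular file."""
    # os.path.isfile follows symlinks, which matters because /etc/localtime
    # is often a symlink into the zoneinfo directory.
    return os.path.isfile(path) and os.access(path, os.R_OK)


def require_localtime(path=LOCALTIME_PATH):
    """Raise a catchable error when the timezone file is missing.

    Hypothetical helper, not part of pyarrow.
    """
    if not localtime_available(path):
        raise RuntimeError(
            f"{path} is missing or unreadable; installing the tzdata "
            "package in the image is one known fix"
        )
```

Calling require_localtime() just before orc.read_table(...) would surface the
problem as a RuntimeError with a readable message.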
> Simple example arrow script fails on ubuntu:latest docker container
> -------------------------------------------------------------------
>
> Key: ARROW-16386
> URL: https://issues.apache.org/jira/browse/ARROW-16386
> Project: Apache Arrow
> Issue Type: Bug
> Environment: The active environment is described by the Dockerfile below;
> I'm running Docker on Windows 10.
> Reporter: Larry Dawson
> Priority: Major
>
> This Dockerfile uses ubuntu:latest, which at this time resolves to
> ubuntu:jammy:
> {code:java}
> FROM ubuntu:latest
> RUN apt update -y
> RUN apt upgrade -y
> RUN apt install -y vim libssl-dev libpq-dev python3 python3-venv build-essential
> RUN cd /opt && mkdir python_environments && cd python_environments && python3 -m venv venv && . venv/bin/activate && python -m pip install --upgrade pip && pip install pyarrow pandas
> RUN cd /opt/python_environments && . venv/bin/activate && python -c "import pandas as pd; import pyarrow as pa; from pyarrow import orc; orc.write_table(pa.table({'col1': [1,2,3]}), 'test.orc'); pdtbl = orc.read_table('test.orc')"
> {code}
> The build fails at the orc.read_table call with this error:
> {code:java}
> #9 0.939 terminate called after throwing an instance of 'orc::TimezoneError'
> #9 0.939 what(): Can't open /etc/localtime
> #9 0.944 Aborted {code}
> The error message is accurate: there is no /etc/localtime file in the
> ubuntu:latest Docker image.
>
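
The failure quoted above ends in "terminate called ... Aborted", which means the
whole process is killed rather than a Python exception being raised, so a
try/except around orc.read_table cannot intercept it. One way to keep a larger
program alive is to run the read in a child interpreter and inspect its exit
status. The sketch below assumes that approach; run_isolated and ORC_READ are
hypothetical names, and the snippet in ORC_READ is a shortened form of the read
step from the report.

```python
import subprocess
import sys


def run_isolated(code, timeout=60):
    """Run `code` in a fresh interpreter and return (returncode, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


# Shortened read step from the report; assumes test.orc already exists.
ORC_READ = "from pyarrow import orc; orc.read_table('test.orc')"

if __name__ == "__main__":
    rc, err = run_isolated(ORC_READ)
    if rc != 0:
        # A non-zero status covers both ordinary errors and an abort
        # in the child process.
        print(f"ORC read failed in child process (exit status {rc}):\n{err}")
```

The parent process survives an abort in the child and can log stderr, at the
cost of starting a second interpreter.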
--
This message was sent by Atlassian Jira
(v8.20.7#820007)