[ 
https://issues.apache.org/jira/browse/ARROW-16386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529091#comment-17529091
 ] 

Larry Dawson commented on ARROW-16386:
--------------------------------------

This works fine on debian:latest but breaks on ubuntu:latest. It turns out 
that the Ubuntu image is missing the timezone setup (there is no 
/etc/localtime), and adding the following line to the Dockerfile fixes the 
problem:
{code:java}
run DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt-get -y install tzdata {code}
I've fixed this in my Dockerfile, but I'm not sure whether this counts as a 
bug in pyarrow. The code I'm running is simple and has no functional 
dependency on the local timezone, so I don't think it should fail when the 
timezone hasn't been set up. That said, I have to admit that timezone setup 
is pretty basic functionality.
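
Because the failure is a C++ terminate/abort rather than a Python exception, 
a try/except around orc.read_table cannot catch it. As a rough sketch of a 
workaround on the Python side, one could check for the timezone file before 
reading. The helper names localtime_available and read_orc_guarded are 
hypothetical and not part of pyarrow, and the check assumes /etc/localtime is 
the only file the ORC reader needs, which I have not verified:

```python
import os


def localtime_available(path="/etc/localtime"):
    """Return True if the timezone file exists and is readable.

    os.path.isfile follows symlinks, so a dangling /etc/localtime
    symlink is reported as missing.
    """
    return os.path.isfile(path) and os.access(path, os.R_OK)


def read_orc_guarded(orc_path, localtime_path="/etc/localtime"):
    """Hypothetical wrapper: fail with a Python error instead of an abort."""
    if not localtime_available(localtime_path):
        raise RuntimeError(
            f"{localtime_path} is missing; install tzdata (or otherwise "
            "configure the timezone) before reading ORC files"
        )
    # Imported lazily so the check itself does not require pyarrow.
    from pyarrow import orc

    return orc.read_table(orc_path)
```

Calling read_orc_guarded('test.orc') in the failing container would then 
raise a RuntimeError that names the missing file instead of killing the 
interpreter.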

> Simple example arrow script fails on ubuntu:latest docker container
> -------------------------------------------------------------------
>
>                 Key: ARROW-16386
>                 URL: https://issues.apache.org/jira/browse/ARROW-16386
>             Project: Apache Arrow
>          Issue Type: Bug
>         Environment: Active environment is described by Docker file, but I'm 
> running docker on windows 10.
>            Reporter: Larry Dawson
>            Priority: Major
>
> This Dockerfile uses ubuntu:latest, which at this time equates to 
> ubuntu:jammy:
> {code:java}
> from ubuntu:latest
> run apt update -y
> run apt upgrade -y
> run apt install -y vim libssl-dev libpq-dev python3 python3-venv 
> build-essential
> run cd /opt && mkdir python_environments && cd python_environments && python3 
> -m venv venv && . venv/bin/activate && python -m pip install --upgrade pip && 
> pip install pyarrow pandas
> run cd /opt/python_environments && . venv/bin/activate && python -c "import 
> pandas as pd; import pyarrow as pa; from pyarrow import orc; 
> orc.write_table(pa.table({'col1': [1,2,3]}), 'test.orc'); pdtbl = 
> orc.read_table('test.orc')" {code}
> Fails on the orc.read_table command with this error:
> {code:java}
> #9 0.939 terminate called after throwing an instance of 'orc::TimezoneError'
> #9 0.939   what():  Can't open /etc/localtime
> #9 0.944 Aborted {code}
> The error report is accurate: there is no /etc/localtime file in the 
> ubuntu:latest docker image.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
