[ https://issues.apache.org/jira/browse/ARROW-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17092026#comment-17092026 ]
Shawn Li commented on ARROW-8435:
---------------------------------
Hi Will,
I posted the issue there as well because I'm not sure what the root cause is
or which project it belongs to, since it occurred while using pyarrow's
`write_to_dataset` method. Thanks for linking them together. By the way, what
a small world! I hope you're doing well.
> [Python] A TypeError is raised while token expires during writing to S3
> -----------------------------------------------------------------------
>
> Key: ARROW-8435
> URL: https://issues.apache.org/jira/browse/ARROW-8435
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 0.15.1
> Reporter: Shawn Li
> Priority: Critical
>
> This issue occurs when an STS token expires *in the middle of* writing to S3.
> An OSError: Write failed: TypeError("'NoneType' object is not
> subscriptable",) is raised instead of a PermissionError.
>
> OSError: Write failed: TypeError("'NoneType' object is not subscriptable",)
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.6/site-packages/pyarrow/parquet.py", line 1450, in write_to_dataset
>     write_table(subtable, f, **kwargs)
>   File "/usr/local/lib/python3.6/site-packages/pyarrow/parquet.py", line 1344, in write_table
>     writer.write_table(table, row_group_size=row_group_size)
>   File "/usr/local/lib/python3.6/site-packages/pyarrow/parquet.py", line 474, in write_table
>     self.writer.write_table(table, row_group_size=row_group_size)
>   File "pyarrow/_parquet.pyx", line 1375, in pyarrow._parquet.ParquetWriter.write_table
>   File "pyarrow/error.pxi", line 80, in pyarrow.lib.check_status
> pyarrow.lib.ArrowIOError: Arrow error: IOError: The provided token has expired.. Detail: Python exception: PermissionError
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.6/site-packages/s3fs/core.py", line 1096, in _upload_chunk
>     PartNumber=part, UploadId=self.mpu['UploadId'],
> TypeError: 'NoneType' object is not subscriptable
> environment is:
> s3fs==0.4.0
> boto3==1.10.27
> botocore==1.13.27
> pyarrow==0.15.1
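
For illustration only, here is a minimal sketch of how a TypeError can mask
the original PermissionError in the way the traceback above suggests. It
assumes the upload handle is still None when a later chunk upload runs. The
class `FakeMultipartWriter` and the helpers `write` and `root_cause` are
hypothetical names made up for this example; this is not the s3fs or pyarrow
implementation.

```python
class FakeMultipartWriter:
    """Hypothetical stand-in for a buffered S3 file; NOT the s3fs code."""

    def __init__(self):
        # Would hold the multipart-upload handle once the upload is initiated.
        self.mpu = None

    def initiate(self, token_valid):
        if not token_valid:
            raise PermissionError("The provided token has expired.")
        self.mpu = {"UploadId": "example-upload-id"}

    def upload_chunk(self, part):
        # Raises TypeError if initiate() never succeeded and mpu is still None.
        return {"PartNumber": part, "UploadId": self.mpu["UploadId"]}


def write(writer, token_valid):
    try:
        writer.initiate(token_valid)
    except PermissionError:
        # Assumed cleanup path that still tries to upload; the new error
        # raised here hides the PermissionError from the caller.
        writer.upload_chunk(1)


def root_cause(exc):
    """Follow the implicit exception chain back to the first exception."""
    while exc.__context__ is not None:
        exc = exc.__context__
    return exc


try:
    write(FakeMultipartWriter(), token_valid=False)
except TypeError as err:
    print(type(root_cause(err)).__name__)  # prints "PermissionError"
```

The caller only sees the TypeError, but Python keeps the original exception
in `__context__`, so walking that chain is one way to recover the
PermissionError until the masking itself is fixed.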