Re: Problems with reading ORC files with S3 filesystem

2021-08-17 Thread Piotr Jagielski
Hi David,

Thanks for your answer. I finally managed to read ORC files by:
- switching to s3a:// in my Flink SQL table path parameter
- providing all the properties in Hadoop's core-site.xml file (fs.s3a.endpoint, fs.s3a.path.style.access, fs.s3a.aws.credentials.provider, fs.s3a.access.key, ...)
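For reference, a minimal core-site.xml sketch along those lines is shown below. The endpoint, credentials-provider, and key values are illustrative placeholders only (the original message is truncated after fs.s3a.access.key), and the fs.s3a.secret.key property is an assumption about a typical static-credentials setup rather than something stated in the thread.

    <?xml version="1.0"?>
    <!-- Minimal sketch of a Hadoop core-site.xml for the s3a:// scheme.
         All values below are placeholders / assumptions, not taken from the thread. -->
    <configuration>
      <property>
        <name>fs.s3a.endpoint</name>
        <value>https://s3.example.com</value>   <!-- placeholder endpoint -->
      </property>
      <property>
        <name>fs.s3a.path.style.access</name>
        <value>true</value>                     <!-- typical for non-AWS, path-style endpoints -->
      </property>
      <property>
        <name>fs.s3a.aws.credentials.provider</name>
        <value>org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider</value>
      </property>
      <property>
        <name>fs.s3a.access.key</name>
        <value>YOUR_ACCESS_KEY</value>          <!-- placeholder -->
      </property>
      <property>
        <name>fs.s3a.secret.key</name>          <!-- assumed; the preview cuts off before this point -->
        <value>YOUR_SECRET_KEY</value>
      </property>
    </configuration>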

Re: Problems with reading ORC files with S3 filesystem

2021-08-16 Thread David Morávek
Hi Piotr,

Unfortunately this is a known, long-standing issue [1]. The problem is that the ORC format does not use Flink's filesystem abstraction for the actual reading of the underlying files, so you have to adjust your Hadoop config accordingly. There is also a corresponding SO question [2] covering this.

Problems with reading ORC files with S3 filesystem

2021-08-14 Thread Piotr Jagielski
Hi,

I want to use the Flink SQL filesystem connector to read ORC files via the S3 filesystem on Flink 1.13. My table definition looks like this:

create or replace table xxx (..., startdate string) partitioned by (startdate) with ('connector'='filesystem', 'format'='orc', 'path'='s3://xxx/orc/yyy')

I followed ...
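For readability, a sketch of the same table definition is laid out below, written as a plain CREATE TABLE. The original column list is elided, so a hypothetical id column stands in for it, and the path uses the s3a:// scheme that, per the reply of 2021-08-17 above, resolved the issue.

    -- Sketch of the table definition from this message. The original column
    -- list is elided ("..."); `id` below is a hypothetical stand-in. The path
    -- uses the s3a:// scheme, which is what resolved the issue upthread.
    CREATE TABLE xxx (
      id STRING,           -- hypothetical placeholder for the elided columns
      startdate STRING
    ) PARTITIONED BY (startdate) WITH (
      'connector' = 'filesystem',
      'format'    = 'orc',
      'path'      = 's3a://xxx/orc/yyy'
    );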