Hi!

> If Flink is not happy with a large Hive table data
Currently it is. A Hive lookup table (currently implemented just like a
filesystem lookup table) cannot look up values by a specific key, so it has
to load all of the data into memory.

> Did you mean putting the Hive table data into a Kafka/Kinesis and joining
> the main stream

This is one solution if you'd like to use a streaming join. If you prefer
lookup joins, you can try storing the data in a JDBC source and using JDBC
lookup joins instead. JDBC lookup sources can look up values by specific
keys, and you can tune the cache size by changing the configs here [1].

[1] https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/jdbc/#lookup-cache

Jason Yi <93t...@gmail.com> wrote on Tue, Feb 8, 2022, at 05:33:

> Hi Caizhi,
>
> Could you tell me more details about the streaming joins that you
> suggested? Did you mean putting the Hive table data into a Kafka/Kinesis
> stream and joining the main stream with the Hive table data stream with a
> very long watermark?
>
> In my use case, the Hive table is an account dimension table and I wanted
> to join an event stream with the account dimension in Flink. I thought a
> lookup table source would work for my use case, but I ran into the
> performance problem mentioned above.
>
> If there's a good solution, I'm open to it. I just need to confirm whether
> Flink is not happy with a large Hive table.
>
> Jason.
>
>
> On Sun, Feb 6, 2022 at 7:01 PM Caizhi Weng <tsreape...@gmail.com> wrote:
>
>> Hi!
>>
>> Each parallelism of the lookup operation will load all data from the
>> lookup table source, so you're loading 10GB of data into each parallelism
>> and storing it in JVM memory. That is not only slow but also very
>> memory-consuming.
>>
>> Have you tried joining your main stream with the Hive table directly
>> (that is, using streaming joins instead of lookup joins)? Does that meet
>> your needs, or why do you have to use lookup joins?
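[Editor's note: a minimal sketch of the JDBC lookup-join approach suggested above. All table, column, and connection names are hypothetical; the `lookup.cache.*` options are the standard Flink JDBC connector options linked at [1], and the `events` stream is assumed to carry a processing-time attribute `proc_time`.]

```sql
-- Hypothetical dimension table backed by JDBC (e.g. MySQL).
-- The lookup cache keeps at most 1 million rows for 10 minutes, so most
-- probes are served from memory instead of hitting the database.
CREATE TABLE account_dim (
  account_id   BIGINT,
  account_name STRING,
  PRIMARY KEY (account_id) NOT ENFORCED
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://host:3306/db',
  'table-name' = 'account_dim',
  'lookup.cache.max-rows' = '1000000',
  'lookup.cache.ttl' = '10min'
);

-- Lookup join: each event probes the JDBC source (or its cache) by key,
-- rather than loading the whole dimension table into each parallelism.
SELECT e.*, d.account_name
FROM events AS e
  JOIN account_dim FOR SYSTEM_TIME AS OF e.proc_time AS d
  ON e.account_id = d.account_id;
```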
>>
>> Jason Yi <93t...@gmail.com> wrote on Sat, Feb 5, 2022, at 08:01:
>>
>>> Hello,
>>>
>>> I created external tables on Hive with data in S3 and wanted to use
>>> those tables as lookup tables in Flink.
>>>
>>> When I used an external table containing a small amount of data as a
>>> lookup table, Flink quickly loaded the data into TM memory and did a
>>> temporal join to an event stream. But when I used an external table
>>> containing ~10GB of data, Flink took so long to load the data that it
>>> finally returned a timeout error. (I set heartbeat.timeout to 200000.)
>>>
>>> Is there a way to make Flink read Hive data faster? Or is this normal,
>>> and would MySQL lookup tables be recommended when we have a large amount
>>> of dimension data?
>>>
>>> Here's the test environment:
>>> - Flink 1.14.0
>>> - EMR 6.5
>>> - Hive 3.1.2 installed on EMR
>>> - Hive with the default Metastore on EMR (not MySQL or Glue Metastore)
>>> - Parquet source data in S3 for the external table on Hive
>>>
>>> Below is part of the Flink log produced while loading the Hive table
>>> data. Flink seemed to open one Parquet file multiple times before moving
>>> on to open another Parquet file. I wonder if this is normal. Why didn't
>>> Flink read data from multiple files in parallel? I'm not sure if this is
>>> a problem caused by the default Hive Metastore.
>>>
>>> ......
>>> 2022-02-04 22:42:54,839 INFO org.apache.flink.table.filesystem.FileSystemLookupFunction [] - Populating lookup join cache
>>> 2022-02-04 22:42:54,839 INFO org.apache.flink.table.filesystem.FileSystemLookupFunction [] - Populating lookup join cache
>>> 2022-02-04 22:42:55,083 INFO org.apache.hadoop.mapred.FileInputFormat [] - Total input files to process : 12
>>> 2022-02-04 22:42:55,084 INFO org.apache.flink.connectors.hive.read.HiveTableInputFormat [] - Use flink parquet ColumnarRowData reader.
>>> 2022-02-04 22:42:55,096 INFO org.apache.hadoop.mapred.FileInputFormat [] - Total input files to process : 12
>>> 2022-02-04 22:42:55,097 INFO org.apache.flink.connectors.hive.read.HiveTableInputFormat [] - Use flink parquet ColumnarRowData reader.
>>> 2022-02-04 22:42:55,105 INFO com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem [] - Opening 's3://bucket/path/to/files/part-00000-55f0ff62-bf83-4eac-8ce8-308bd9efda24-c000.snappy.parquet' for reading
>>> 2022-02-04 22:42:55,116 INFO com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem [] - Opening 's3://bucket/path/to/files/part-00000-55f0ff62-bf83-4eac-8ce8-308bd9efda24-c000.snappy.parquet' for reading
>>> 2022-02-04 22:42:55,169 INFO com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem [] - Opening 's3://bucket/path/to/files/part-00000-55f0ff62-bf83-4eac-8ce8-308bd9efda24-c000.snappy.parquet' for reading
>>> 2022-02-04 22:42:57,782 INFO org.apache.flink.connectors.hive.read.HiveTableInputFormat [] - Use flink parquet ColumnarRowData reader.
>>> 2022-02-04 22:42:57,783 INFO com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem [] - Opening 's3://bucket/path/to/files/part-00000-55f0ff62-bf83-4eac-8ce8-308bd9efda24-c000.snappy.parquet' for reading
>>> ......
>>> 2022-02-04 22:43:18,651 INFO org.apache.flink.connectors.hive.read.HiveTableInputFormat [] - Use flink parquet ColumnarRowData reader.
>>> 2022-02-04 22:43:18,676 INFO com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem [] - Opening 's3://bucket/path/to/files/part-00001-55f0ff62-bf83-4eac-8ce8-308bd9efda24-c000.snappy.parquet' for reading
>>> 2022-02-04 22:43:18,722 INFO com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem [] - Opening 's3://bucket/path/to/files/part-00001-55f0ff62-bf83-4eac-8ce8-308bd9efda24-c000.snappy.parquet' for reading
>>> 2022-02-04 22:43:18,778 INFO org.apache.flink.connectors.hive.read.HiveTableInputFormat [] - Use flink parquet ColumnarRowData reader.
>>> 2022-02-04 22:43:18,779 INFO com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem [] - Opening 's3://bucket/path/to/files/part-00001-55f0ff62-bf83-4eac-8ce8-308bd9efda24-c000.snappy.parquet' for reading
>>> 2022-02-04 22:43:23,406 INFO com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem [] - Opening 's3://bucket/path/to/files/part-00001-55f0ff62-bf83-4eac-8ce8-308bd9efda24-c000.snappy.parquet' for reading
>>> ......
>>
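
[Editor's note: a minimal sketch of the streaming-join alternative discussed in this thread, i.e. ingesting the dimension data as a changelog stream and using an event-time temporal join instead of a lookup join. All names and connector settings are hypothetical; the `events` stream is assumed to have a rowtime attribute `event_time`.]

```sql
-- Hypothetical: the account dimension ingested as a changelog from Kafka
-- (instead of being read from Hive as a lookup table). The primary key and
-- watermark make it a versioned table usable in a temporal join.
CREATE TABLE account_dim (
  account_id   BIGINT,
  account_name STRING,
  update_time  TIMESTAMP(3),
  WATERMARK FOR update_time AS update_time - INTERVAL '5' SECOND,
  PRIMARY KEY (account_id) NOT ENFORCED
) WITH (
  'connector' = 'upsert-kafka',
  'topic' = 'account-dim',
  'properties.bootstrap.servers' = 'broker:9092',
  'key.format' = 'json',
  'value.format' = 'json'
);

-- Event-time temporal join: each event is matched against the version of
-- the dimension row that was valid at the event's timestamp, so the full
-- dimension table is never bulk-loaded per parallelism.
SELECT e.*, d.account_name
FROM events AS e
  JOIN account_dim FOR SYSTEM_TIME AS OF e.event_time AS d
  ON e.account_id = d.account_id;
```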