[GitHub] [iceberg] openinx opened a new pull request #1423: Flink: add flink-runtime module

GitBox Thu, 03 Sep 2020 21:14:16 -0700


openinx opened a new pull request #1423:
URL: https://github.com/apache/iceberg/pull/1423



   This patch adds a separate Flink runtime module named `flink-runtime`. It shades the common dependency jars and packages all Flink connector classes into a single jar. So far I have done a basic verification on my local machine, as follows:
   
   1. Download the Apache Flink 1.11 release binary.
   
   ```bash
   wget https://www.apache.org/dyn/closer.lua/flink/flink-1.11.1/flink-1.11.1-bin-scala_2.12.tgz
   tar xzvf flink-1.11.1-bin-scala_2.12.tgz
   cd flink-1.11.1
   ```
   
   2. Start the Flink cluster with the Hadoop environment loaded.
   
   ```bash
   export HADOOP_CLASSPATH=`/Users/openinx/software/hadoop-binary/hadoop-2.9.2/bin/hadoop classpath`
   ./bin/start-cluster.sh   # start flink cluster
   ```
   
   3. Build the Iceberg runtime jar.
   
   
   ```bash
   <apache-iceberg-project-rootdir>/gradlew build -x test
   ```
   
   The runtime jar is then located at a path like:
   
   ```bash
   ls -altr /Users/openinx/software/apache-iceberg/flink-runtime/build/libs/iceberg-flink-runtime-850a44c.jar
   -rw-r--r--  1 openinx  staff  34621584 Sep  4 11:24 /Users/openinx/software/apache-iceberg/flink-runtime/build/libs/iceberg-flink-runtime-850a44c.jar
   ```
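   
   The jar name embeds a version suffix (`850a44c` above), so the exact path changes between builds. A minimal sketch for locating the jar without hardcoding that suffix; `find_runtime_jar` is a hypothetical helper name, not part of this patch:
   
   ```bash
   # Hypothetical helper (not part of the patch): print the most recently
   # modified iceberg-flink-runtime jar in the given build/libs directory.
   # Prints nothing if no such jar exists.
   find_runtime_jar() {
     # ls -t sorts by modification time, newest first; head keeps the first entry.
     ls -t "$1"/iceberg-flink-runtime-*.jar 2>/dev/null | head -n 1
   }
   
   # Example, run from the Iceberg project root:
   #   RUNTIME_JAR=$(find_runtime_jar flink-runtime/build/libs)
   ```
   
   The resulting `$RUNTIME_JAR` could then be passed to the `-j` option of `sql-client.sh` in the next step.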
   
   4. Start the Flink SQL client.
   
   ```bash
   # Switch to flink binary root dir
   
   export HADOOP_CLASSPATH=`/Users/openinx/software/hadoop-binary/hadoop-2.9.2/bin/hadoop classpath`
   ./bin/sql-client.sh \
       embedded \
       -j /Users/openinx/software/apache-iceberg/flink-runtime/build/libs/iceberg-flink-runtime-850a44c.jar \
       shell
   ```
   
   5. Execute a few Flink SQL statements:
   
   ```sql
   Flink SQL> create catalog iceberg_catalog with(
   >   'type'='iceberg',
   >   'catalog-type'='hadoop',
   >   'property-version'='1',
   >   'warehouse'='/Users/openinx/software/flink/build-target/hadoop-warehouse'
   > );
   [INFO] Catalog has been created.
   
   Flink SQL> USE catalog iceberg_catalog;
   
   Flink SQL> CREATE TABLE test (
   >     id bigint,
   >     data string
   > );
   [INFO] Table has been created.
   
   Flink SQL> insert into test select 1, 'hello';
   [INFO] Submitting SQL update statement to the cluster...
   [INFO] Table update statement has been successfully submitted to the cluster:
   Job ID: 458cf238116135db262ec7dbec47f32e
   ```
   
   6. Check the Iceberg table:
   
   
   ```bash
   ➜  default git:(master) ✗ pwd     
   /Users/openinx/software/flink/build-target/hadoop-warehouse/default
   ➜  default git:(master) ✗ tree -a
   .
   └── test
       ├── data
       │   ├── .00000-0-bcfe3440-c326-4d17-be1f-ca3056a45376-00001.parquet.crc
       │   └── 00000-0-bcfe3440-c326-4d17-be1f-ca3056a45376-00001.parquet
       └── metadata
           ├── .2220c21a-38ee-4c27-a498-521de74f7eb7-m0.avro.crc
           ├── .snap-8091339106931514070-1-2220c21a-38ee-4c27-a498-521de74f7eb7.avro.crc
           ├── .v1.metadata.json.crc
           ├── .v2.metadata.json.crc
           ├── .version-hint.text.crc
           ├── 2220c21a-38ee-4c27-a498-521de74f7eb7-m0.avro
           ├── snap-8091339106931514070-1-2220c21a-38ee-4c27-a498-521de74f7eb7.avro
           ├── v1.metadata.json
           ├── v2.metadata.json
           └── version-hint.text
   
   3 directories, 12 files
   ```
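   
   To inspect the committed state rather than just the file listing, the current metadata file can be looked up from the version hint. This is a sketch only: it assumes `version-hint.text` holds the latest metadata version number and that metadata files are named `v<N>.metadata.json`, as the listing above suggests. `current_metadata_file` is a hypothetical helper name.
   
   ```bash
   # Hypothetical helper: print the path of the current metadata file for a
   # table stored in a Hadoop catalog warehouse.
   # Assumption: metadata/version-hint.text contains the latest version number.
   current_metadata_file() {
     table_dir="$1"
     hint="$table_dir/metadata/version-hint.text"
     if [ ! -f "$hint" ]; then
       echo "no version hint found under $table_dir" >&2
       return 1
     fi
     # Strip whitespace and newlines so only the version number remains.
     version=$(tr -d '[:space:]' < "$hint")
     echo "$table_dir/metadata/v${version}.metadata.json"
   }
   
   # Example, run from the warehouse's `default` directory:
   #   cat "$(current_metadata_file test)"
   ```
   
   After the single insert above, the hint would presumably point at `v2.metadata.json`, whose snapshot list should contain the `snap-8091339106931514070-...` manifest list.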
   
   More work is still needed to fill in the LICENSE and NOTICE files for the `flink-runtime` module, and I will test more cases to confirm that it works well.




