kbendick commented on a change in pull request #2920:
URL: https://github.com/apache/iceberg/pull/2920#discussion_r685755105
########## File path: site/docs/flink.md ##########
@@ -44,8 +44,11 @@ To create iceberg table in flink, we recommend to use [Flink SQL Client](https:/
 Step.1 Downloading the flink 1.11.x binary package from the apache flink [download page](https://flink.apache.org/downloads.html). We now use scala 2.12 to archive the apache iceberg-flink-runtime jar, so it's recommended to use flink 1.11 bundled with scala 2.12.
 ```bash
-wget https://downloads.apache.org/flink/flink-1.11.1/flink-1.11.1-bin-scala_2.12.tgz
-tar xzvf flink-1.11.1-bin-scala_2.12.tgz
+FLINK_VERSION=1.11.1
+SCALA_VERSION=2.12
+APACHE_URL=https://downloads.apache.org

Review comment:
Nit / non-blocking: Have you tried running this script? Maybe you have cached dependencies, but I don't think Flink 1.11.1 is available at that URL anymore.

One concern I have here is that this will only work for the few packages kept in the latest apache mirrors. Currently, for Flink the main `downloads` URL has the latest Flink 1.11, 1.12, and 1.13 (i.e. 1.11.4, 1.12.5, and 1.13.2). As you can see, this gets dated pretty quickly, as the main `downloads` page is _only_ intended for the active latest major / minor versions (usually 2-3 major versions at a time, with only the latest minor version for each). This URL only has Flink 1.11.4.

I know this PR doesn't update the existing behavior, but arguably it gives the impression that this script can be used by simply changing the Flink version etc., and that's not really the case.

The archives seem to have all versions, in addition to the current latest versions (from the current URL). The latest versions were pushed to the archive on the same day as they were pushed to the downloads link, so it appears that they're in sync. Possibly it's better to instruct users to go to the archives instead? https://archive.apache.org/dist/flink/

In any case, if you can please ensure that you've run this script and that it works, that would be great.
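To make the reviewer's suggestion concrete, here is a hypothetical sketch of the download step pointed at the Apache archive (which retains every Flink release) rather than `downloads.apache.org` (which only hosts the latest minor of each active version). The variable names mirror the diff; `APACHE_ARCHIVE_URL`, `FLINK_TGZ`, and `FLINK_URL` are assumptions for illustration, not part of the PR:

```shell
# Hypothetical sketch: build the download URL against archive.apache.org,
# which keeps every Flink release (downloads.apache.org drops old minors).
FLINK_VERSION=1.11.4
SCALA_VERSION=2.12
APACHE_ARCHIVE_URL=https://archive.apache.org/dist

FLINK_TGZ=flink-${FLINK_VERSION}-bin-scala_${SCALA_VERSION}.tgz
FLINK_URL=${APACHE_ARCHIVE_URL}/flink/flink-${FLINK_VERSION}/${FLINK_TGZ}
echo "$FLINK_URL"
# then: wget "$FLINK_URL" && tar xzvf "$FLINK_TGZ"
```

The actual download is left as a comment so the sketch stays runnable offline; any archived version can be substituted for `FLINK_VERSION`.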
########## File path: site/docs/flink.md ##########
@@ -78,12 +81,25 @@ as the following:
 # HADOOP_HOME is your hadoop root directory after unpack the binary package.
 export HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath`
+
+# download Iceberg dependency
+ICEBERG_VERSION=0.11.1
+MAVEN_URL=https://repo1.maven.org/maven2
+ICEBERG_MAVEN_URL=$MAVEN_URL/org/apache/iceberg
+ICEBERG_PACKAGE=iceberg-flink-runtime
+wget $ICEBERG_MAVEN_URL/$ICEBERG_PACKAGE/$ICEBERG_VERSION/$ICEBERG_PACKAGE-$ICEBERG_VERSION.jar
+
 # wget the flink-sql-connector-hive-2.3.6_2.11-1.11.0.jar from the above bundled jar URL firstly.
+HIVE_VERSION=2.3.6
+SCALA_VERSION=2.11
+FLINK_VERSION=1.11.0
+FLINK_MAVEN_URL=$MAVEN_URL/org/apache/flink
+FLINK_CONNECTOR_PACKAGE=flink-sql-connector-hive
+wget $FLINK_MAVEN_URL/$FLINK_CONNECTOR_PACKAGE-$HIVE_VERSION\_$SCALA_VERSION/$FLINK_VERSION/$FLINK_CONNECTOR_PACKAGE-$HIVE_VERSION\_$SCALA_VERSION-$FLINK_VERSION.jar

Review comment:
Nit: Why is the `_` before `$SCALA_VERSION` escaped with a backslash? If it's to ensure that the variable names are properly expanded, would it be possible to use `${}` to delimit them and avoid the backslash before the underscore? Something like this might work: `${FLINK_CONNECTOR_PACKAGE}-${HIVE_VERSION}_${SCALA_VERSION}/${FLINK_VERSION}`. For me, the backslash before the underscore makes this much harder to read.

########## File path: site/docs/flink.md ##########
@@ -78,12 +81,25 @@ as the following:
 # HADOOP_HOME is your hadoop root directory after unpack the binary package.
 export HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath`
+
+# download Iceberg dependency
+ICEBERG_VERSION=0.11.1
+MAVEN_URL=https://repo1.maven.org/maven2
+ICEBERG_MAVEN_URL=$MAVEN_URL/org/apache/iceberg
+ICEBERG_PACKAGE=iceberg-flink-runtime
+wget $ICEBERG_MAVEN_URL/$ICEBERG_PACKAGE/$ICEBERG_VERSION/$ICEBERG_PACKAGE-$ICEBERG_VERSION.jar
+
 # wget the flink-sql-connector-hive-2.3.6_2.11-1.11.0.jar from the above bundled jar URL firstly.

Review comment:
Nit: Can we avoid the hard-coded jar file name in this comment if we're turning all of its components into variables?
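The backslash question above comes down to how the shell tokenizes variable names: underscores are legal inside a name, so without a delimiter the shell reads past the intended boundary. A small self-contained illustration (the version values are borrowed from the diff; the three assignments are illustrative, not from the PR):

```shell
HIVE_VERSION=2.3.6
SCALA_VERSION=2.11

# Underscores are valid in variable names, so $HIVE_VERSION_ is parsed as a
# single (unset) variable named HIVE_VERSION_ and silently expands to nothing:
broken=$HIVE_VERSION_$SCALA_VERSION        # yields just "2.11"

# The backslash used in the PR terminates the name, which works but is noisy:
escaped=$HIVE_VERSION\_$SCALA_VERSION      # yields "2.3.6_2.11"

# Braces delimit the name explicitly, as the reviewer suggests:
braced=${HIVE_VERSION}_${SCALA_VERSION}    # yields "2.3.6_2.11"

echo "$broken"
echo "$escaped"
echo "$braced"
```

The unbraced form fails silently rather than erroring, which is exactly why the braced style is easier to audit in docs.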
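On the hard-coded file name nit: since every component of the jar name is already a variable, the name could be derived once instead of spelled out in a comment. A hypothetical sketch, assuming the variable names from the diff plus a new `CONNECTOR_JAR` variable that is not in the PR:

```shell
HIVE_VERSION=2.3.6
SCALA_VERSION=2.11
FLINK_VERSION=1.11.0
FLINK_CONNECTOR_PACKAGE=flink-sql-connector-hive

# Derive the jar name from the same variables so a comment or wget target
# can never drift out of sync with the versions above:
CONNECTOR_JAR=${FLINK_CONNECTOR_PACKAGE}-${HIVE_VERSION}_${SCALA_VERSION}-${FLINK_VERSION}.jar
echo "$CONNECTOR_JAR"
```

Changing any one version variable then updates the derived name everywhere it is used.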
