kbendick commented on a change in pull request #2920:
URL: https://github.com/apache/iceberg/pull/2920#discussion_r685755105
########## File path: site/docs/flink.md ##########
@@ -44,8 +44,11 @@ To create iceberg table in flink, we recommend to use [Flink SQL Client](https:/
 Step.1 Downloading the flink 1.11.x binary package from the apache flink [download page](https://flink.apache.org/downloads.html). We now use scala 2.12 to archive the apache iceberg-flink-runtime jar, so it's recommended to use flink 1.11 bundled with scala 2.12.
 ```bash
-wget https://downloads.apache.org/flink/flink-1.11.1/flink-1.11.1-bin-scala_2.12.tgz
-tar xzvf flink-1.11.1-bin-scala_2.12.tgz
+FLINK_VERSION=1.11.1
+SCALA_VERSION=2.12
+APACHE_URL=https://downloads.apache.org

Review comment:
Nit / non-blocking: Have you tried running this script? Maybe you have cached dependencies, but I don't think Flink 1.11.1 is available at that URL anymore.

One concern I have here is that this will only work for the few packages kept in the latest apache mirrors. Currently, for Flink the main `downloads` URL has the latest Flink 1.11, 1.12, and 1.13 (i.e. 1.11.4, 1.12.5, and 1.13.2). As you can see, this gets dated pretty quickly, as the main `downloads` page is _only_ intended for the active latest major / minor versions (usually 2-3 major versions at a time, with only the latest minor version for each). This URL only has Flink 1.11.4.

I know this PR doesn't update the existing behavior, but arguably it gives the impression that this script can be used by simply changing the Flink version etc., and that's not really the case.

The archives seem to have all versions, in addition to the current latest versions (from the current URL). The latest versions were pushed to the archive on the same day as they were pushed to the downloads link, so it appears that they're in sync. Possibly it's better to instruct users to go to the archives instead? https://archive.apache.org/dist/flink/

In any case, if you can please ensure that you've run this script and that it works, that would be great.
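To make the reviewer's suggestion concrete, here is a hypothetical sketch of the download step pointed at the Apache archive (which retains every Flink release) rather than `downloads.apache.org` (which only hosts the latest minor of each active version). The variable names mirror the diff; `APACHE_ARCHIVE_URL`, `FLINK_TGZ`, and `FLINK_URL` are assumptions for illustration, not part of the PR:

```shell
# Hypothetical sketch: build the download URL against archive.apache.org,
# which keeps every Flink release (downloads.apache.org drops old minors).
FLINK_VERSION=1.11.4
SCALA_VERSION=2.12
APACHE_ARCHIVE_URL=https://archive.apache.org/dist

FLINK_TGZ=flink-${FLINK_VERSION}-bin-scala_${SCALA_VERSION}.tgz
FLINK_URL=${APACHE_ARCHIVE_URL}/flink/flink-${FLINK_VERSION}/${FLINK_TGZ}
echo "$FLINK_URL"
# then: wget "$FLINK_URL" && tar xzvf "$FLINK_TGZ"
```

The actual download is left as a comment so the sketch stays runnable offline; any archived version can be substituted for `FLINK_VERSION`.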
########## File path: site/docs/flink.md ##########
@@ -78,12 +81,25 @@ as the following:
 # HADOOP_HOME is your hadoop root directory after unpack the binary package.
 export HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath`
+
+# download Iceberg dependency
+ICEBERG_VERSION=0.11.1
+MAVEN_URL=https://repo1.maven.org/maven2
+ICEBERG_MAVEN_URL=$MAVEN_URL/org/apache/iceberg
+ICEBERG_PACKAGE=iceberg-flink-runtime
+wget $ICEBERG_MAVEN_URL/$ICEBERG_PACKAGE/$ICEBERG_VERSION/$ICEBERG_PACKAGE-$ICEBERG_VERSION.jar
+
 # wget the flink-sql-connector-hive-2.3.6_2.11-1.11.0.jar from the above bundled jar URL firstly.
+HIVE_VERSION=2.3.6
+SCALA_VERSION=2.11
+FLINK_VERSION=1.11.0
+FLINK_MAVEN_URL=$MAVEN_URL/org/apache/flink
+FLINK_CONNECTOR_PACKAGE=flink-sql-connector-hive
+wget $FLINK_MAVEN_URL/$FLINK_CONNECTOR_PACKAGE-$HIVE_VERSION\_$SCALA_VERSION/$FLINK_VERSION/$FLINK_CONNECTOR_PACKAGE-$HIVE_VERSION\_$SCALA_VERSION-$FLINK_VERSION.jar

Review comment:
Nit: Why is the `_` before `$SCALA_VERSION` escaped with a backslash? If it's to ensure that the variable names are properly expanded, would it be possible to use `${}` to delimit them and avoid the backslash before the underscore? Something like this might work: `${FLINK_CONNECTOR_PACKAGE}-${HIVE_VERSION}_${SCALA_VERSION}/${FLINK_VERSION}`. For me, the backslash before the underscore makes this much harder to read.

########## File path: site/docs/flink.md ##########
@@ -78,12 +81,25 @@ as the following:
 # HADOOP_HOME is your hadoop root directory after unpack the binary package.
 export HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath`
+
+# download Iceberg dependency
+ICEBERG_VERSION=0.11.1
+MAVEN_URL=https://repo1.maven.org/maven2
+ICEBERG_MAVEN_URL=$MAVEN_URL/org/apache/iceberg
+ICEBERG_PACKAGE=iceberg-flink-runtime
+wget $ICEBERG_MAVEN_URL/$ICEBERG_PACKAGE/$ICEBERG_VERSION/$ICEBERG_PACKAGE-$ICEBERG_VERSION.jar
+
 # wget the flink-sql-connector-hive-2.3.6_2.11-1.11.0.jar from the above bundled jar URL firstly.

Review comment:
Nit: Can we avoid the hard-coded jar file name in this comment if we're turning all of its components into variables?
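The backslash question above comes down to how the shell tokenizes variable names: underscores are legal inside a name, so without a delimiter the shell reads past the intended boundary. A small self-contained illustration (the version values are borrowed from the diff; the three assignments are illustrative, not from the PR):

```shell
HIVE_VERSION=2.3.6
SCALA_VERSION=2.11

# Underscores are valid in variable names, so $HIVE_VERSION_ is parsed as a
# single (unset) variable named HIVE_VERSION_ and silently expands to nothing:
broken=$HIVE_VERSION_$SCALA_VERSION        # yields just "2.11"

# The backslash used in the PR terminates the name, which works but is noisy:
escaped=$HIVE_VERSION\_$SCALA_VERSION      # yields "2.3.6_2.11"

# Braces delimit the name explicitly, as the reviewer suggests:
braced=${HIVE_VERSION}_${SCALA_VERSION}    # yields "2.3.6_2.11"

echo "$broken"
echo "$escaped"
echo "$braced"
```

The unbraced form fails silently rather than erroring, which is exactly why the braced style is easier to audit in docs.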
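On the hard-coded file name nit: since every component of the jar name is already a variable, the name could be derived once instead of spelled out in a comment. A hypothetical sketch, assuming the variable names from the diff plus a new `CONNECTOR_JAR` variable that is not in the PR:

```shell
HIVE_VERSION=2.3.6
SCALA_VERSION=2.11
FLINK_VERSION=1.11.0
FLINK_CONNECTOR_PACKAGE=flink-sql-connector-hive

# Derive the jar name from the same variables so a comment or wget target
# can never drift out of sync with the versions above:
CONNECTOR_JAR=${FLINK_CONNECTOR_PACKAGE}-${HIVE_VERSION}_${SCALA_VERSION}-${FLINK_VERSION}.jar
echo "$CONNECTOR_JAR"
```

Changing any one version variable then updates the derived name everywhere it is used.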
