[GitHub] [incubator-sedona] Kimahriman commented on a diff in pull request #653: [SEDONA-135] Consolidate and upgrade Hadoop version

GitBox Mon, 01 Aug 2022 04:06:30 -0700


Kimahriman commented on code in PR #653:
URL: https://github.com/apache/incubator-sedona/pull/653#discussion_r934405311



##########
pom.xml:
##########
@@ -66,6 +66,12 @@
     </properties>
 
     <dependencies>
+        <dependency>
+            <groupId>com.google.guava</groupId>
+            <artifactId>guava</artifactId>
+            <version>${guava.version}</version>
+            <scope>${dependency.scope}</scope>

Review Comment:
   Why does guava need to be added, doesn't Hadoop have this already?



##########
core/pom.xml:
##########
@@ -41,9 +41,15 @@
         <dependency>
             <groupId>org.apache.hadoop</groupId>
             <artifactId>hadoop-minicluster</artifactId>
-            <version>2.7.4</version>
+            <version>${hadoop.version}</version>
             <scope>test</scope>
         </dependency>
+        <dependency>
+          <groupId>org.mockito</groupId>
+          <artifactId>mockito-all</artifactId>
+          <version>1.8.5</version>
+          <scope>compile</scope>

Review Comment:
   Why did this need to be added as an additional dependency? And if it is 
needed I assume this should be `test` scope



##########
.github/workflows/python.yml:
##########
@@ -62,7 +62,19 @@ jobs:
     - env:
         SPARK_VERSION: ${{ matrix.spark }}
         SCALA_VERSION: ${{ matrix.scala }}
-      run: if [ ${SPARK_VERSION:0:1} == "3" ]; then mvn -q clean install 
-DskipTests -Dscala=${SCALA_VERSION:0:4} -Dspark=3.0 -Dgeotools ; else mvn -q 
clean install -DskipTests -Dscala=${SCALA_VERSION:0:4} -Dspark=2.4 -Dgeotools ; 
fi
+      run: |
+        if [ ${SPARK_VERSION:0:1} == "3" ]; \
+        then \
+          if [ ${HADOOP_VERSION:0:1} == "3" ]; \

Review Comment:
   I don't think `HADOOP_VERSION` is set to anything, these are populated in 
the `env` block above. And might not be necessary regarding the comment about 
the new profiles?



##########
pom.xml:
##########
@@ -618,5 +612,33 @@
                 <module>flink</module>
             </modules>
         </profile>
+        <profile>
+            <id>hadoop3.0</id>
+            <activation>
+                <property>
+                    <name>hadoop</name>
+                    <value>3.0</value>
+                </property>
+                <activeByDefault>false</activeByDefault>
+            </activation>
+            <properties>
+                <hadoop.version>3.2.4</hadoop.version>
+                <guava.version>27.0-jre</guava.version>
+            </properties>
+        </profile>
+        <profile>
+            <id>hadoop2.10</id>
+            <activation>
+                <property>
+                    <name>hadoop</name>
+                    <value>2.10</value>
+                </property>
+                <activeByDefault>true</activeByDefault>
+            </activation>
+            <properties>
+                <hadoop.version>2.10.2</hadoop.version>
+                <guava.version>11.0.2</guava.version>
+            </properties>
+        </profile>

Review Comment:
   Do we need profiles for this? It looks like for the CI (except for python), 
Spark 3 uses Hadoop 3 and Spark 2 uses Hadoop 2, can we just set the hadoop 
version in those existing profiles?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [incubator-sedona] Kimahriman commented on a diff in pull request #653: [SEDONA-135] Consolidate and upgrade Hadoop version

Reply via email to