abhioncbr commented on PR #12956: URL: https://github.com/apache/pinot/pull/12956#issuecomment-2062832376
> Do we use hive libraries explicitly? Which part failed if we do not explicitly include it?

These are the imports in the [ORCRecordReader](https://github.com/apache/pinot/blob/master/pinot-plugins/pinot-input-format/pinot-orc/src/main/java/org/apache/pinot/plugin/inputformat/orc/ORCRecordReader.java#L33). Previous versions of ORC satisfied them transitively, but version 1.9.3 does not.

```java
import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.ColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.ListColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.MapColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.StructColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.TimestampColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
```

Also, I checked ORC version 1.9.3 using the command `mvn dependency:tree`. Here is the output:

```bash
+- org.apache.orc:orc-core:jar:1.9.3:compile
[INFO] | +- org.apache.orc:orc-shims:jar:1.9.3:compile
[INFO] | | \- org.apache.hadoop:hadoop-client-api:jar:3.3.5:compile
[INFO] | +- org.apache.commons:commons-lang3:jar:3.14.0:compile
[INFO] | +- io.airlift:aircompressor:jar:0.26:compile
[INFO] | +- org.jetbrains:annotations:jar:17.0.0:compile
[INFO] | \- org.threeten:threeten-extra:jar:1.7.1:compile
```

Versus ORC version 1.5.9:

```bash
+- org.apache.orc:orc-core:jar:1.5.9:compile
[INFO] | +- org.apache.orc:orc-shims:jar:1.5.9:compile
[INFO] | +- com.google.protobuf:protobuf-java:jar:3.25.2:compile
[INFO] | +- commons-lang:commons-lang:jar:2.6:compile
[INFO] | +- io.airlift:aircompressor:jar:0.26:compile
[INFO] | +- javax.xml.bind:jaxb-api:jar:2.3.1:compile
[INFO] | | \- javax.activation:javax.activation-api:jar:1.2.0:compile
[INFO] | +- org.apache.hive:hive-storage-api:jar:2.8.1:compile
[INFO] | \- org.threeten:threeten-extra:jar:1.5.0:compile
```

Clearly, ORC has removed the `hive-storage-api` compile dependency, so the `org.apache.hadoop.hive.ql.exec.vector` classes are no longer on the compile classpath unless we declare it ourselves.

I am not sure whether it is better to add the hive dependency in the parent pom or in the pinot-orc pom, but I added it in the parent because it's easier to manage. Let me know if you feel otherwise.

Thanks
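
For comparison, here is a rough sketch of what the explicit declaration might look like if it lived in the pinot-orc pom rather than the parent. This is illustrative only and not the actual change in the PR: the placement is hypothetical, and the version 2.8.1 is simply the one ORC 1.5.9 resolved in the tree above, so whether it is the right match for ORC 1.9.3 is an assumption that should be verified.

```xml
<!-- Hypothetical fragment for the pinot-orc module pom; not the actual PR diff -->
<dependencies>
  <!-- Declared explicitly because orc-core 1.9.3 no longer pulls it in at
       compile scope (see the dependency:tree output above) -->
  <dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-storage-api</artifactId>
    <!-- Assumption: 2.8.1 is what ORC 1.5.9 resolved transitively;
         check compatibility with ORC 1.9.3 before relying on it -->
    <version>2.8.1</version>
  </dependency>
</dependencies>
```

If the version is instead managed in the parent pom's `dependencyManagement` section, the module-level entry can omit `<version>`, which keeps the version in one place while limiting the dependency itself to the module that imports the hive classes.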
