[GitHub] [beam] TheNeuralBit commented on a change in pull request #14117: [BEAM-7929] Support column projection for Parquet Tables

GitBox Mon, 01 Mar 2021 16:25:12 -0800


TheNeuralBit commented on a change in pull request #14117:
URL: https://github.com/apache/beam/pull/14117#discussion_r585146136




##########
File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/parquet/ParquetTableProvider.java
##########
@@ -39,19 +36,18 @@
  *   favorite_numbers ARRAY<INTEGER>
  * )
  * TYPE 'parquet'
- * LOCATION '/home/admin/users.parquet'
+ * LOCATION '/home/admin/orders/'

Review comment:
       Thanks for updating this. I guess the original version won't work, since 
we always append `/*` to the location for reads. We should probably just pass 
the location through directly instead, so the user can specify a glob if they 
want. Another follow-on Jira, I suppose.
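
       To illustrate the difference, here is a minimal sketch. The class and 
method names are hypothetical and this is not the actual table provider code; 
it only assumes the read path builds its file pattern by plain string 
concatenation, as described above.

```java
// Hypothetical sketch; names are illustrative, not taken from the Beam codebase.
public class LocationGlobSketch {

  // Behavior described in the comment: "/*" is always appended to LOCATION.
  public static String appendWildcard(String location) {
    return location + "/*";
  }

  // Suggested alternative: use LOCATION unchanged, so the user supplies the glob.
  public static String passThrough(String location) {
    return location;
  }

  public static void main(String[] args) {
    // A single-file LOCATION turns into a pattern that looks inside a directory.
    System.out.println(appendWildcard("/home/admin/users.parquet"));
    // A directory LOCATION turns into a pattern over its contents.
    System.out.println(appendWildcard("/home/admin/orders"));
    // With pass-through, a user-written glob reaches the reader as written.
    System.out.println(passThrough("/home/admin/orders/*.parquet"));
  }
}
```

With pass-through, the single-file form (`/home/admin/users.parquet`) and an 
explicit glob would both be left for the reader to interpret as written.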

##########
File path: sdks/java/extensions/sql/build.gradle
##########
@@ -79,6 +79,7 @@ dependencies {
   provided project(":sdks:java:io:kafka")
   provided project(":sdks:java:io:google-cloud-platform")
   compile project(":sdks:java:io:mongodb")
+  compile library.java.avro

Review comment:
       I suppose this is necessary because of the direct references to 
`org.apache.avro`? I think if we can push all that complexity into ParquetIO it 
won't be necessary, right?




