umehrot2 commented on issue #1981: URL: https://github.com/apache/hudi/issues/1981#issuecomment-678000764
@rubenssoto until this is fixed, would you be okay querying through `spark-sql` instead? Since you are using COW, you can make your `spark-sql` queries use Spark's listing mechanism and just pass the Hoodie path filter to it. I think this is going to give you better query performance. Here is how you should start `spark-sql`:

```
spark-sql \
  --conf "spark.serializer=org.apache.spark.serializer.KryoSerializer" \
  --conf "spark.hadoop.mapreduce.input.pathFilter.class=org.apache.hudi.hadoop.HoodieROTablePathFilter" \
  --jars /usr/lib/hudi/hudi-spark-bundle.jar,/usr/lib/spark/external/lib/spark-avro.jar
```
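If you end up running this often, the `spark-sql` invocation above could be wrapped in a small shell function so the path filter is always passed. This is only a sketch: the function name `hudi_spark_sql` and the variables `HUDI_BUNDLE_JAR`, `SPARK_AVRO_JAR` and `SPARK_SQL_BIN` are hypothetical names made up for illustration, and the jar locations are simply the ones from the command above, so adjust them to your installation.

```shell
#!/usr/bin/env bash
# Sketch of a wrapper around the spark-sql invocation from the comment above.
# hudi_spark_sql, HUDI_BUNDLE_JAR, SPARK_AVRO_JAR and SPARK_SQL_BIN are
# hypothetical names introduced for this example; they are not part of Hudi
# or Spark.

# Jar locations taken from the command above; override via the environment.
HUDI_BUNDLE_JAR="${HUDI_BUNDLE_JAR:-/usr/lib/hudi/hudi-spark-bundle.jar}"
SPARK_AVRO_JAR="${SPARK_AVRO_JAR:-/usr/lib/spark/external/lib/spark-avro.jar}"
SPARK_SQL_BIN="${SPARK_SQL_BIN:-spark-sql}"

# Start spark-sql with the Kryo serializer and the Hoodie path filter,
# forwarding any extra arguments (for example -e "<query>") unchanged.
hudi_spark_sql() {
  "$SPARK_SQL_BIN" \
    --conf "spark.serializer=org.apache.spark.serializer.KryoSerializer" \
    --conf "spark.hadoop.mapreduce.input.pathFilter.class=org.apache.hudi.hadoop.HoodieROTablePathFilter" \
    --jars "${HUDI_BUNDLE_JAR},${SPARK_AVRO_JAR}" \
    "$@"
}
```

Calling `hudi_spark_sql` with no arguments opens the interactive shell as before. A one-off query can be passed with the CLI's `-e` option, for example `hudi_spark_sql -e "SELECT count(*) FROM my_db.my_cow_table"`, where the table name is a placeholder.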
