[
https://issues.apache.org/jira/browse/HUDI-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pratyaksh Sharma updated HUDI-484:
----------------------------------
Description:
When the HiveIncrementalPuller class is used to incrementally pull changes from Hive, it throws a NullPointerException (NPE) because it fails to locate {{IncrementalPull.sqltemplate}} on the classpath, even though the bundled jar does contain the template.
A screenshot showing the exception is attached.
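For context, this failure mode is consistent with a classpath resource lookup returning {{null}} and the stream then being used unchecked. The sketch below is a hypothetical illustration of that pattern only; the class and method names are assumptions, not the actual Hudi source.

```java
import java.io.InputStream;
import java.util.Scanner;

// Hypothetical illustration (not Hudi code): loading a SQL template from the
// classpath. Class.getResourceAsStream returns null when the resource is not
// visible to the classloader; using that stream without a null check surfaces
// as a NullPointerException, which matches the symptom reported here.
public class TemplateLoader {

    static String loadTemplate(String resourcePath) {
        InputStream in = TemplateLoader.class.getResourceAsStream(resourcePath);
        if (in == null) {
            // Without this guard, the caller would hit an NPE downstream.
            throw new IllegalStateException("Template not found on classpath: " + resourcePath);
        }
        // Read the whole stream as one token.
        return new Scanner(in, "UTF-8").useDelimiter("\\A").next();
    }

    public static void main(String[] args) {
        // A missing resource yields null rather than an exception:
        System.out.println(
            TemplateLoader.class.getResourceAsStream("/no/such/IncrementalPull.sqltemplate") == null);
    }
}
```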
Steps to reproduce:
# Copy {{hive-jdbc-2.3.1.jar}} and {{log4j-1.2.17.jar}} to the {{docker/demo/config}} folder
# {{cd docker && ./setup_demo.sh}}
# {{cat docker/demo/data/batch_1.json | kafkacat -b kafkabroker -t stock_ticks -P}}
# {{docker exec -it adhoc-2 /bin/bash}}
# {{spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer $HUDI_UTILITIES_BUNDLE --storage-type COPY_ON_WRITE --source-class org.apache.hudi.utilities.sources.JsonKafkaSource --source-ordering-field ts --target-base-path /user/hive/warehouse/stock_ticks_cow --target-table stock_ticks_cow --props /var/demo/config/kafka-source.properties --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider}}
# {{/var/hoodie/ws/hudi-hive/run_sync_tool.sh --jdbc-url jdbc:hive2://hiveserver:10000 --user hive --pass hive --partitioned-by dt --base-path /user/hive/warehouse/stock_ticks_cow --database default --table stock_ticks_cow}}
# {{java -cp /var/hoodie/ws/docker/demo/config/hive-jdbc-2.3.1.jar:/var/hoodie/ws/docker/demo/config/log4j-1.2.17.jar:$HUDI_UTILITIES_BUNDLE org.apache.hudi.utilities.HiveIncrementalPuller --hiveUrl jdbc:hive2://hiveserver:10000 --hiveUser hive --hivePass hive --extractSQLFile /var/hoodie/ws/docker/demo/config/incr_pull.txt --sourceDb default --sourceTable stock_ticks_cow --targetDb tmp --targetTable tempTable --fromCommitTime 0 --maxCommits 1}}
> NPE in HiveIncrementalPuller
> ----------------------------
>
> Key: HUDI-484
> URL: https://issues.apache.org/jira/browse/HUDI-484
> Project: Apache Hudi (incubating)
> Issue Type: Bug
> Components: Incremental Pull
> Reporter: Pratyaksh Sharma
> Priority: Major
> Fix For: 0.5.1
>
> Attachments: Screenshot 2019-12-30 at 4.43.51 PM.png
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)