Hi @Pratyaksh Sharma

Thanks for the steps to reproduce this issue. Please try modifying the code below and test again. The leading "/" makes getResourceAsStream look up the template from the root of the classpath instead of relative to the class's package.

org.apache.hudi.utilities.HiveIncrementalPuller#HiveIncrementalPuller

/ --------------------------------- /
String templateContent = FileIOUtils.readAsUTFString(this.getClass().getResourceAsStream("IncrementalPull.sqltemplate"));

Changed to

/ --------------------------------- /
String templateContent = FileIOUtils.readAsUTFString(this.getClass().getResourceAsStream("/IncrementalPull.sqltemplate"));

best,
lamber-ken

At 2019-12-30 19:25:08, "Pratyaksh Sharma" <pratyaks...@gmail.com> wrote:
>Hi Vinoth,
>
>I am able to reproduce this error on docker setup and have filed a jira -
>https://issues.apache.org/jira/browse/HUDI-484.
>
>Steps to reproduce are mentioned in the jira description itself.
>
>On Thu, Dec 26, 2019 at 12:42 PM Pratyaksh Sharma <pratyaks...@gmail.com> wrote:
>
>> Hi Vinoth,
>>
>> I will try to reproduce the error on docker cluster and keep you updated.
>>
>> On Tue, Dec 24, 2019 at 11:23 PM Vinoth Chandar <vin...@apache.org> wrote:
>>
>>> Pratyaksh,
>>>
>>> If you are still having this issue, could you try reproducing this on the
>>> docker setup
>>>
>>> https://hudi.apache.org/docker_demo.html#step-7--incremental-query-for-copy-on-write-table
>>> similar to this and raise a JIRA.
>>> Happy to look into it and get it fixed if needed
>>>
>>> Thanks
>>> Vinoth
>>>
>>> On Tue, Dec 24, 2019 at 8:43 AM lamberken <lamber...@163.com> wrote:
>>>
>>> > Hi, @Pratyaksh Sharma
>>> >
>>> > The log4j-1.2.17.jar lib also needs to be added to the classpath, for example:
>>> >
>>> > java -cp /path/to/hive-jdbc-2.3.1.jar:/path/to/log4j-1.2.17.jar:packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-0.5.1-SNAPSHOT.jar org.apache.hudi.utilities.HiveIncrementalPuller --help
>>> >
>>> > best,
>>> > lamber-ken
>>> >
>>> > At 2019-12-24 17:23:20, "Pratyaksh Sharma" <pratyaks...@gmail.com> wrote:
>>> > >Hi Vinoth,
>>> > >
>>> > >Sorry my bad, I did not realise earlier that spark is not needed for this
>>> > >class. I tried running it with the below command to get the mentioned
>>> > >exception -
>>> > >
>>> > >Command -
>>> > >
>>> > >java -cp /path/to/hive-jdbc-2.3.1.jar:packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-0.5.1-SNAPSHOT.jar org.apache.hudi.utilities.HiveIncrementalPuller --help
>>> > >
>>> > >Exception -
>>> > >Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/log4j/LogManager
>>> > >        at org.apache.hudi.utilities.HiveIncrementalPuller.<clinit>(HiveIncrementalPuller.java:64)
>>> > >Caused by: java.lang.ClassNotFoundException: org.apache.log4j.LogManager
>>> > >        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
>>> > >        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>> > >        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>>> > >        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>> > >        ... 1 more
>>> > >
>>> > >I was able to fix it by including the corresponding jar in the bundle.
>>> > >
>>> > >After fixing the above, still I am getting the NPE even though the template
>>> > >is bundled in the jar.
>>> > >
>>> > >On Mon, Dec 23, 2019 at 10:45 PM Vinoth Chandar <vin...@apache.org> wrote:
>>> > >
>>> > >> Hi Pratyaksh,
>>> > >>
>>> > >> HiveIncrementalPuller is just a java program. Does not need Spark, since it
>>> > >> just runs a HiveQL remotely..
>>> > >>
>>> > >> On the error you specified, seems like it can't find the template? Can you
>>> > >> see if the bundle does not have the template file.. May be this got broken
>>> > >> during the bundling changes.. (since it's no longer part of the resources
>>> > >> folder of the bundle module).. We should also probably be throwing a better
>>> > >> error than NPE..
>>> > >>
>>> > >> We can raise a JIRA, once you confirm.
>>> > >>
>>> > >> String templateContent = FileIOUtils.readAsUTFString(this.getClass().getResourceAsStream("IncrementalPull.sqltemplate"));
>>> > >>
>>> > >> On Mon, Dec 23, 2019 at 6:02 AM Pratyaksh Sharma <pratyaks...@gmail.com> wrote:
>>> > >>
>>> > >> > Hi,
>>> > >> >
>>> > >> > Can someone guide me or share some documentation regarding how to use
>>> > >> > HiveIncrementalPuller. I already went through the documentation on
>>> > >> > https://hudi.apache.org/querying_data.html. I tried using this puller using
>>> > >> > the below command and facing the given exception.
>>> > >> >
>>> > >> > Any leads are appreciated.
>>> > >> >
>>> > >> > Command -
>>> > >> > spark-submit --name incremental-puller --queue etl --files
>>> > >> > incremental_sql.txt --master yarn --deploy-mode cluster --driver-memory 4g
>>> > >> > --executor-memory 4g --num-executors 2 --class
>>> > >> > org.apache.hudi.utilities.HiveIncrementalPuller
>>> > >> > hudi-utilities-bundle-0.5.1-SNAPSHOT.jar --hiveUrl
>>> > >> > jdbc:hive2://HOST:PORT/ --hiveUser <user> --hivePass <pass>
>>> > >> > --extractSQLFile incremental_sql.txt --sourceDb <source_db> --sourceTable
>>> > >> > <src_table> --targetDb tmp --targetTable tempTable --fromCommitTime 0
>>> > >> > --maxCommits 1
>>> > >> >
>>> > >> > Error -
>>> > >> >
>>> > >> > java.lang.NullPointerException
>>> > >> >         at org.apache.hudi.common.util.FileIOUtils.copy(FileIOUtils.java:73)
>>> > >> >         at org.apache.hudi.common.util.FileIOUtils.readAsUTFString(FileIOUtils.java:66)
>>> > >> >         at org.apache.hudi.common.util.FileIOUtils.readAsUTFString(FileIOUtils.java:61)
>>> > >> >         at org.apache.hudi.utilities.HiveIncrementalPuller.<init>(HiveIncrementalPuller.java:113)
>>> > >> >         at org.apache.hudi.utilities.HiveIncrementalPuller.main(HiveIncrementalPuller.java:343)
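
P.S. A note on the getResourceAsStream change suggested at the top of this thread, as a minimal sketch and not the actual Hudi code. Class.getResourceAsStream resolves a name without a leading "/" relative to the package of the class, and a name with a leading "/" from the root of the classpath. It returns null when nothing is found, which would explain the NullPointerException raised inside FileIOUtils.copy. The class ResourceLookupSketch and its helpers resolvedName and readResource are hypothetical names used only for illustration, and String.class stands in as the anchor class simply because it is always available.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Hudi.
public class ResourceLookupSketch {

    // Mirrors the documented name resolution of Class.getResourceAsStream
    // (ignoring array classes): a leading "/" means "from the classpath root",
    // otherwise the package of the anchor class is prepended.
    public static String resolvedName(Class<?> anchor, String name) {
        if (name.startsWith("/")) {
            return name.substring(1);
        }
        String className = anchor.getName();
        int lastDot = className.lastIndexOf('.');
        if (lastDot < 0) {
            return name; // class in the default package
        }
        return className.substring(0, lastDot).replace('.', '/') + "/" + name;
    }

    // Reads a classpath resource as UTF-8 and fails with a descriptive
    // message, rather than a bare NullPointerException, when it is missing.
    public static String readResource(Class<?> anchor, String name) {
        try (InputStream in = anchor.getResourceAsStream(name)) {
            if (in == null) {
                throw new IllegalStateException("Resource not found on classpath: "
                        + resolvedName(anchor, name));
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // prints "java/lang/IncrementalPull.sqltemplate"
        System.out.println(resolvedName(String.class, "IncrementalPull.sqltemplate"));
        // prints "IncrementalPull.sqltemplate"
        System.out.println(resolvedName(String.class, "/IncrementalPull.sqltemplate"));
        try {
            readResource(String.class, "IncrementalPull.sqltemplate");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

For a class in org.apache.hudi.utilities, the same rule means the relative name is searched under org/apache/hudi/utilities/ inside the jar, while the "/"-prefixed name is searched at the jar root. Whether the template actually sits at the bundle root is an assumption to confirm, for example by listing the jar contents with "jar tf".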