If I install a Hadoop client on my NiFi host, would I be able to get past this error? I don't understand why this processor depends on Hadoop. Other projects, like Drill and Spark, don't carry such a dependency just to write Parquet files.
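For reference, the commonly reported way past the "Could not locate executable null\bin\winutils.exe" error is not a full Hadoop install but a standalone winutils.exe built for the matching Hadoop client version, with HADOOP_HOME pointing at its parent directory. A minimal sketch, assuming a Git-Bash-style shell on Windows and a hypothetical C:\hadoop layout (paths are illustrative, not from the thread):

```shell
# Hypothetical layout: winutils.exe (matching your Hadoop client version)
# has been placed in /c/hadoop/bin (i.e. C:\hadoop\bin on Windows).
export HADOOP_HOME="/c/hadoop"           # directory that contains bin/winutils.exe
export PATH="$HADOOP_HOME/bin:$PATH"     # so the Hadoop client can find winutils.exe
echo "$HADOOP_HOME"
```

NiFi would need to be restarted so the embedded Hadoop client picks up the new environment.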
On Tue, Aug 14, 2018 at 2:58 PM, Juan Pablo Gardella <gardellajuanpa...@gmail.com> wrote:

> It's a warning. You can ignore that.
>
> On Tue, 14 Aug 2018 at 18:53 Bryan Bende <bbe...@gmail.com> wrote:
>
>> Scott,
>>
>> Sorry, I did not realize the Hadoop client would be looking for this
>> winutils.exe when running on Windows.
>>
>> On Linux and macOS you don't need anything external installed outside
>> of NiFi, so I wasn't expecting this.
>>
>> Not sure if there is any other good option here regarding Parquet.
>>
>> Thanks,
>>
>> Bryan
>>
>> On Tue, Aug 14, 2018 at 5:31 PM, scott <tcots8...@gmail.com> wrote:
>>> Hi Bryan,
>>> I'm fine if I have to trick the API, but don't I still need Hadoop
>>> installed somewhere? After creating the core-site.xml as you described,
>>> I get the following errors:
>>>
>>>     Failed to locate the winutils binary in the hadoop binary path
>>>     IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries
>>>     Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>>     Failed to write due to java.io.IOException: No FileSystem for scheme
>>>
>>> BTW, I'm using NiFi version 1.5.
>>>
>>> Thanks,
>>> Scott
>>>
>>> On Tue, Aug 14, 2018 at 12:44 PM, Bryan Bende <bbe...@gmail.com> wrote:
>>>>
>>>> Scott,
>>>>
>>>> Unfortunately the Parquet API itself is tied to the Hadoop Filesystem
>>>> object, which is why NiFi can't read and write Parquet directly to flow
>>>> files (i.e., they don't provide a way to read/write to/from Java input
>>>> and output streams).
>>>>
>>>> The best you can do is trick the Hadoop API into using the local
>>>> file-system by creating a core-site.xml with the following:
>>>>
>>>>     <configuration>
>>>>       <property>
>>>>         <name>fs.defaultFS</name>
>>>>         <value>file:///</value>
>>>>       </property>
>>>>     </configuration>
>>>>
>>>> That will make PutParquet or FetchParquet work with your local
>>>> file-system.
>>>>
>>>> Thanks,
>>>>
>>>> Bryan
>>>>
>>>> On Tue, Aug 14, 2018 at 3:22 PM, scott <tcots8...@gmail.com> wrote:
>>>>> Hello NiFi community,
>>>>> Is there a simple way to read CSV files and write them out as Parquet
>>>>> files without Hadoop? I run NiFi on Windows and don't have access to a
>>>>> Hadoop environment. I'm trying to write the output of my ETL in a
>>>>> compressed and still query-able format. Is there something I should be
>>>>> using instead of Parquet?
>>>>>
>>>>> Thanks for your time,
>>>>> Scott