Hi Bryan, I'm fine if I have to trick the API, but don't I still need Hadoop installed somewhere? After creating the core-site.xml as you described, I get the following errors:
Failed to locate the winutils binary in the hadoop binary path
IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries
Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Failed to write due to java.io.IOException: No FileSystem for scheme

BTW, I'm using NiFi version 1.5

Thanks,
Scott

On Tue, Aug 14, 2018 at 12:44 PM, Bryan Bende <bbe...@gmail.com> wrote:
> Scott,
>
> Unfortunately the Parquet API itself is tied to the Hadoop Filesystem
> object which is why NiFi can't read and write Parquet directly to flow
> files (i.e. they don't provide a way to read/write to/from Java input
> and output streams).
>
> The best you can do is trick the Hadoop API into using the local
> file-system by creating a core-site.xml with the following:
>
> <configuration>
>   <property>
>     <name>fs.defaultFS</name>
>     <value>file:///</value>
>   </property>
> </configuration>
>
> That will make PutParquet or FetchParquet work with your local file-system.
>
> Thanks,
>
> Bryan
>
>
> On Tue, Aug 14, 2018 at 3:22 PM, scott <tcots8...@gmail.com> wrote:
> > Hello NiFi community,
> > Is there a simple way to read CSV files and write them out as Parquet
> > files without Hadoop? I run NiFi on Windows and don't have access to a
> > Hadoop environment. I'm trying to write the output of my ETL in a
> > compressed and still query-able format. Is there something I should be
> > using instead of Parquet?
> >
> > Thanks for your time,
> > Scott
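
For anyone landing on this thread with the same errors: the winutils messages come from Hadoop's Windows shell integration, which looks for %HADOOP_HOME%\bin\winutils.exe via the HADOOP_HOME environment variable or the hadoop.home.dir JVM system property ("null\bin\winutils.exe" means neither was set). A minimal sketch of a workaround, assuming a winutils.exe build matching your Hadoop client version has been placed under C:\hadoop\bin (the path and the java.arg index are placeholders, not NiFi defaults):

    # conf/bootstrap.conf -- pick any java.arg index not already in use
    # assumes winutils.exe was placed at C:\hadoop\bin\winutils.exe
    java.arg.20=-Dhadoop.home.dir=C:\hadoop

The "No FileSystem for scheme" error usually means no FileSystem implementation was registered for the file:// scheme, e.g. because the core-site.xml above isn't listed in the processor's "Hadoop Configuration Resources" property. A commonly suggested addition to that same core-site.xml, declaring the local implementation explicitly:

    <property>
      <name>fs.file.impl</name>
      <value>org.apache.hadoop.fs.LocalFileSystem</value>
    </property>

With hadoop.home.dir pointing at winutils.exe and the core-site.xml wired into PutParquet, writing Parquet to the local disk should work without a full Hadoop install; the client jars NiFi bundles plus winutils.exe are all that's needed.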