Scott,

Unfortunately, the Parquet API itself is tied to the Hadoop Filesystem
object, which is why NiFi can't read and write Parquet directly to flow
files (the API doesn't provide a way to read from or write to plain
Java input and output streams).
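
For example, writing even a single record has to go through a Hadoop
Path rather than a stream. Roughly (an untested sketch, assuming
parquet-avro and hadoop-client are on the classpath; the class name and
output path are just examples):

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;

public class LocalParquetSketch {
    public static void main(String[] args) throws Exception {
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"rec\","
            + "\"fields\":[{\"name\":\"id\",\"type\":\"int\"}]}");

        // The builder takes a Hadoop Path, not an OutputStream; a
        // file:/// URI routes it through Hadoop's LocalFileSystem.
        try (ParquetWriter<GenericRecord> writer = AvroParquetWriter
                .<GenericRecord>builder(new Path("file:///tmp/example.parquet"))
                .withSchema(schema)
                .build()) {
            GenericRecord record = new GenericData.Record(schema);
            record.put("id", 1);
            writer.write(record);
        }
    }
}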

The best you can do is trick the Hadoop API into using the local
file-system by creating a core-site.xml with the following:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>file:///</value>
    </property>
</configuration>

That will make PutParquet or FetchParquet work with your local file-system.
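
Then point the processor's Hadoop Configuration Resources property at
that core-site.xml so it picks up the file:/// default. Something like
(the paths here are just placeholders for wherever you keep things):

Hadoop Configuration Resources: C:\nifi\conf\core-site.xml
Directory: C:\data\parquet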

Thanks,

Bryan


On Tue, Aug 14, 2018 at 3:22 PM, scott <tcots8...@gmail.com> wrote:
> Hello NiFi community,
> Is there a simple way to read CSV files and write them out as Parquet files
> without Hadoop? I run NiFi on Windows and don't have access to a Hadoop
> environment. I'm trying to write the output of my ETL in a compressed and
> still query-able format. Is there something I should be using instead of
> Parquet?
>
> Thanks for your time,
> Scott
