Scott,

Sorry, I did not realize the Hadoop client would look for winutils.exe
when running on Windows.

On Linux and macOS you don't need anything installed outside of NiFi,
so I wasn't expecting this.
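
The "null\bin\winutils.exe" in the error suggests the Hadoop client found
no Hadoop home directory at all: as far as I know it checks the
hadoop.home.dir Java system property first, then the HADOOP_HOME
environment variable, and appends bin\winutils.exe to the result. Below is
a minimal sketch of a pre-flight check that mirrors that lookup, assuming
that order holds for the Hadoop version bundled with NiFi. The helper name
find_winutils is hypothetical and is not part of Hadoop or NiFi.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_winutils(env: Mapping[str, str],
                  home_property: Optional[str] = None) -> Optional[Path]:
    """Return the expected winutils.exe path if it exists, else None.

    home_property stands in for the JVM's -Dhadoop.home.dir value, which
    is assumed to take precedence over the HADOOP_HOME variable.
    """
    home = home_property or env.get("HADOOP_HOME")
    if not home:
        # Neither setting is present; this is the case that shows up as
        # "null\bin\winutils.exe" in the error message.
        return None
    candidate = Path(home) / "bin" / "winutils.exe"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    found = find_winutils(os.environ)
    print(found if found else "winutils.exe not found; HADOOP_HOME may be unset")
```

If the check returns nothing, a commonly reported workaround is to point
HADOOP_HOME at a folder that contains bin\winutils.exe before starting
NiFi. That is untested against NiFi 1.5, so treat it as something to try
rather than a confirmed fix.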

I'm not sure there is any other good option here for Parquet.

Thanks,

Bryan


On Tue, Aug 14, 2018 at 5:31 PM, scott <tcots8...@gmail.com> wrote:
> Hi Bryan,
> I'm fine if I have to trick the API, but don't I still need Hadoop installed
> somewhere? After creating the core-site.xml as you described, I get the
> following errors:
>
> Failed to locate the winutils binary in the hadoop binary path
> IOException: Could not locate executable null\bin\winutils.exe in the Hadoop
> binaries
> Unable to load native-hadoop library for your platform... using builtin-java
> classes where applicable
> Failed to write due to java.io.IOException: No FileSystem for scheme
>
> BTW, I'm using NiFi version 1.5
>
> Thanks,
> Scott
>
>
> On Tue, Aug 14, 2018 at 12:44 PM, Bryan Bende <bbe...@gmail.com> wrote:
>>
>> Scott,
>>
>> Unfortunately, the Parquet API itself is tied to the Hadoop FileSystem
>> object, which is why NiFi can't read and write Parquet directly to flow
>> files (i.e. the API doesn't provide a way to read from or write to Java
>> input and output streams).
>>
>> The best you can do is trick the Hadoop API into using the local
>> file-system by creating a core-site.xml with the following:
>>
>> <configuration>
>>     <property>
>>         <name>fs.defaultFS</name>
>>         <value>file:///</value>
>>     </property>
>> </configuration>
>>
>> That will make PutParquet or FetchParquet work with your local
>> file-system.
>>
>> Thanks,
>>
>> Bryan
>>
>>
>> On Tue, Aug 14, 2018 at 3:22 PM, scott <tcots8...@gmail.com> wrote:
>> > Hello NiFi community,
>> > Is there a simple way to read CSV files and write them out as Parquet
>> > files without Hadoop? I run NiFi on Windows and don't have access to a
>> > Hadoop environment. I'm trying to write the output of my ETL in a
>> > compressed and still query-able format. Is there something I should be
>> > using instead of Parquet?
>> >
>> > Thanks for your time,
>> > Scott
>
>
