Hi,
I am doing something similar, but having wrestled with Hive data population 
(not from NiFi) and its performance, I am currently looking at Apache Drill as 
my SQL abstraction layer over my Hadoop cluster (similar in size to yours). To 
this end, I have chosen Avro as my ‘persistence’ format and am using a number 
of processors to get from raw data, through mapping attributes to JSON, to Avro 
(via schemas), ultimately storing the result in HDFS. Querying this with Drill 
is then a breeze, because the schema is already embedded in the data, which 
Drill understands. The schema can also be extended without affecting existing 
data.
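To illustrate the point about extending the schema, here is a minimal sketch in plain Python (standard library only, not the Avro library itself). It mimics the Avro resolution rule that a field missing from older data is filled from the default declared in the newer schema. The record name, the field names (id, timestamp, source) and the resolve helper are all hypothetical and chosen only for illustration; they are not taken from my actual flow.

```python
import json

# Hypothetical Avro record schema for an event (names are illustrative only).
EVENT_SCHEMA_V1 = {
    "type": "record",
    "name": "Event",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "timestamp", "type": "long"},
    ],
}

# Version 2 adds a nullable field with a default, so data written
# with version 1 can still be read using the newer schema.
EVENT_SCHEMA_V2 = {
    "type": "record",
    "name": "Event",
    "fields": EVENT_SCHEMA_V1["fields"] + [
        {"name": "source", "type": ["null", "string"], "default": None},
    ],
}


def resolve(record, reader_schema):
    """Project a decoded record onto reader_schema.

    Fields missing from the record take the schema default; a missing
    field without a default is an error. Extra fields are ignored.
    """
    out = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            out[name] = record[name]
        elif "default" in field:
            out[name] = field["default"]
        else:
            raise ValueError(f"missing field with no default: {name}")
    return out


# An event written before the 'source' field existed.
old_event = json.loads('{"id": "e1", "timestamp": 1456914780}')
resolved = resolve(old_event, EVENT_SCHEMA_V2)
```

In a real flow the Avro tooling performs this resolution when the files are read; the sketch only shows why adding a field with a default leaves existing data readable.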
HTH – I’m sure there are a ton of other ways to skin this particular cat though,
Conrad

From: Mike Harding
Date: Wednesday, 2 March 2016 at 10:33
Subject: Nifi JSON event storage in HDFS

Hi All,

I currently have a small Hadoop cluster running with HDFS and Hive. My ultimate 
goal is to leverage NiFi's ingestion and flow capabilities to store real-time 
external JSON-formatted event data.

What I am unclear about is the best strategy/design for storing FlowFile data 
(i.e. JSON events in my case) in HDFS so that it can then be accessed and 
analysed in Hive tables.

Is much of the storage design handled in the NiFi flow, or do I need to set 
something up outside NiFi to ensure I can query each JSON-formatted event as a 
record in, for example, a Hive log table?

Any examples or suggestions much appreciated,

Thanks,
M


