Hi,

I am doing something similar. Having wrestled with Hive data population (not from NiFi) and its performance, I am currently looking at Apache Drill as the SQL abstraction layer over my Hadoop cluster (similar in size to yours). To this end, I have chosen Avro as my ‘persistence’ format and am using a number of processors to get from raw data, through mapping attributes to JSON, to Avro (via schemas), and ultimately storing the result in HDFS. Querying this with Drill is then a breeze, as the schema is already embedded in the data, which Drill understands. The schema can also be extended without impacting existing data.

HTH. I’m sure there are plenty of other ways to skin this particular cat, though.

Conrad
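
To illustrate the point about extending the schema without impacting existing data, here is a minimal sketch in plain Python. It does not use a real Avro library; it only mimics the Avro schema-resolution idea that a field added with a default can be filled in when reading older records, and that fields unknown to the reader are ignored. The schema contents, field names (id, ts, source), and the resolve helper are all hypothetical and chosen purely for illustration.

```python
import json

# Hypothetical Avro-style record schema, version 1 (field names are invented).
SCHEMA_V1 = {
    "type": "record",
    "name": "Event",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "ts", "type": "long"},
    ],
}

# Version 2 adds an optional field with a default, so older records stay readable.
SCHEMA_V2 = {
    "type": "record",
    "name": "Event",
    "fields": SCHEMA_V1["fields"] + [
        {"name": "source", "type": ["null", "string"], "default": None},
    ],
}


def resolve(record, reader_schema):
    """Project a decoded record onto reader_schema (simplified illustration).

    Fields missing from the record take the reader's default; fields the
    reader does not know about are dropped.
    """
    out = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            out[name] = record[name]
        elif "default" in field:
            out[name] = field["default"]
        else:
            raise ValueError(f"no value and no default for field {name!r}")
    return out


# An event written before the schema was extended, and one written after.
old_event = json.loads('{"id": "e1", "ts": 1456914780}')
new_event = json.loads('{"id": "e2", "ts": 1456914790, "source": "sensor-a"}')

print(resolve(old_event, SCHEMA_V2))  # old data read with the extended schema
print(resolve(new_event, SCHEMA_V1))  # new data read with the original schema
```

In an actual flow, the JSON-to-Avro conversion and this kind of resolution would be handled by the Avro tooling and the query engine rather than by hand-written code like this.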
From: Mike Harding <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, 2 March 2016 at 10:33
To: "[email protected]" <[email protected]>
Subject: Nifi JSON event storage in HDFS

Hi All,

I currently have a small Hadoop cluster running with HDFS and Hive. My ultimate goal is to leverage NiFi's ingestion and flow capabilities to store real-time, external, JSON-formatted event data.

What I am unclear about is the best strategy/design for storing FlowFile data (i.e. JSON events in my case) within HDFS so that it can then be accessed and analysed in Hive tables.

Is much of the storage design handled in the NiFi flow, or do I need to set something up outside of NiFi to ensure I can query each JSON-formatted event as a record in a Hive log table, for example?

Any examples or suggestions much appreciated.

Thanks,
M
