Sumo,

True, the MapR FS implementation may have compatibility issues. Additionally, things are complicated by the need to bundle some of MapR's proprietary jars, which can't be redistributed with NiFi.
We at Hortonworks have enabled some of our customers to get NiFi and MapR working together before; maybe check with your friendly support engineer for details?

Andrew

From: Sumanth Chinthagunta <[email protected]>
Reply-To: "[email protected]"
Date: Wednesday, March 2, 2016 at 9:40 PM
To: "[email protected]"
Subject: Re: Nifi JSON event storage in HDFS

I am exploring using the Kite processor to store data in Hadoop. I hope this lets me change the storage engine from HDFS to Hive to HBase later. Since my Hadoop distribution is MapR, I haven't had full success yet.

Sumo

On Mar 2, 2016, at 2:54 AM, Mike Harding <[email protected]> wrote:

Hi Conrad,

Thanks for the heads up, I will investigate Apache Drill. I also forgot to mention that I have downstream requirements about which tools the data modellers are comfortable using: they want to use Hive and Spark as the primary data access engines, so the data needs to be persisted in HDFS in a way that these services can easily access. But you're right, there are multiple ways of doing this, and I'm hoping NiFi will help scope and simplify the pipeline design.

Cheers,
M

On 2 March 2016 at 10:38, Conrad Crampton <[email protected]> wrote:

Hi,

I am doing something similar, but having wrestled with Hive data population (not from NiFi) and its performance, I am currently looking at Apache Drill as my SQL abstraction layer over my Hadoop cluster (similar size to yours). To this end, I have chosen Avro as my ‘persistence’ format and am using a number of processors to get from raw data through mapping attributes to JSON to Avro (via schemas), ultimately storing it in HDFS. Querying this with Drill is then a breeze, as the schema is already specified within the data, which Drill understands.
The schema can also be extended without impacting existing data.

HTH – I’m sure there are a ton of other ways to skin this particular cat though,
Conrad

From: Mike Harding <[email protected]>
Reply-To: "[email protected]"
Date: Wednesday, 2 March 2016 at 10:33
To: "[email protected]"
Subject: Nifi JSON event storage in HDFS

Hi All,

I currently have a small Hadoop cluster running with HDFS and Hive. My ultimate goal is to leverage NiFi's ingestion and flow capabilities to store real-time, external, JSON-formatted event data.

What I am unclear about is the best strategy/design for storing FlowFile data (i.e. JSON events in my case) within HDFS so that it can then be accessed and analysed in Hive tables. Is much of the storage design handled in the NiFi flow, or do I need to set something up outside NiFi to ensure I can query each JSON-formatted event as a record in, for example, a Hive log table?

Any examples or suggestions much appreciated,

Thanks,
M
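Conrad's remark in the thread above, that an Avro schema can be extended without impacting existing data, can be illustrated with a small sketch. This is not Avro library code and not taken from anyone's flow in the thread: it is a plain-Python imitation of the idea that a reader schema with a new field carrying a default can still read records written under the older schema. The schema name `Event`, the field names `id`, `ts`, and `source`, and the helper `resolve` are all hypothetical.

```python
import json

# Hypothetical event schema, written in Avro's JSON schema notation.
SCHEMA_V1 = {
    "type": "record",
    "name": "Event",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "ts", "type": "long"},
    ],
}

# Version 2 adds one field and gives it a default, so records that were
# written under version 1 remain readable.
SCHEMA_V2 = {
    "type": "record",
    "name": "Event",
    "fields": SCHEMA_V1["fields"] + [
        {"name": "source", "type": "string", "default": "unknown"},
    ],
}


def resolve(record, reader_schema):
    """Project an already-decoded record onto reader_schema.

    Fields missing from the record take the reader's default; fields the
    reader does not declare are dropped. A missing field without a default
    is an error. (Toy stand-in for schema resolution, assumption only.)
    """
    out = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            out[name] = record[name]
        elif "default" in field:
            out[name] = field["default"]
        else:
            raise ValueError(f"no value or default for field {name!r}")
    return out


# A JSON event captured before the schema was extended.
old_event = json.loads('{"id": "a1", "ts": 1456914780}')
print(resolve(old_event, SCHEMA_V2))
```

In a real pipeline the Avro library or the query engine would perform this resolution; the sketch only shows why adding a field with a default leaves older files usable.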
