Hi,

I need to migrate a log analysis system from MySQL plus a C++ real-time computing framework to the Hadoop ecosystem.
I want to build a data warehouse, but I don't know which one is the right choice: Cassandra, Hive, or just Spark SQL? There are few benchmarks comparing these systems.

My scenario is as follows:

Every 5 seconds, Flume transfers a log file from an IDC. The log file is pre-formatted for loading into MySQL.
There are many IDCs, and they may disconnect from or reconnect to Flume at random.
Every online IDC must receive an analysis of its logs every 5 minutes.

Any suggestions?

Thanks,
Meng