Hi Pieter,
Happy new year!
I believe that always depends on a number of factors, as with any kind of
visualization problem involving large amounts of data:
* How quickly do you need the visualizations to be available?
* How up-to-date do they need to be?
* How complex?
* How polished/customized do they need to be?
* How familiar are you with these frameworks? (This could be a reason not to
use a library if they are otherwise equal in capabilities.)
It sounds like you want to create a simple histogram across the full history of
the stored data. So I’ll throw in another option that is commonly used for such
use cases:
* Zeppelin notebook:
  * Access the data stored in HDFS via Hive.
  * A bit of preparation in Hive is required (and can be scheduled), e.g.
    creating external tables and converting the data into a more efficient
    format, such as ORC.
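The Hive preparation could look roughly like the sketch below. Note that the
table names, HDFS path, column layout, and delimiter are placeholders I made up
for illustration, not taken from your setup:

```sql
-- External table over the raw data already sitting in HDFS
-- (path, delimiter, and columns are hypothetical).
CREATE EXTERNAL TABLE IF NOT EXISTS flows_raw (
  ip_src   STRING,
  ip_dst   STRING,
  event_ts BIGINT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/metron/flows';

-- Converted copy stored as ORC for faster scans; this step is the part
-- that can be scheduled to run periodically.
CREATE TABLE IF NOT EXISTS flows_orc STORED AS ORC AS
SELECT * FROM flows_raw;

-- Example aggregation for the histogram/graph: connection counts
-- per (ip_src, ip_dst) pair, queryable from a Zeppelin paragraph.
SELECT ip_src, ip_dst, COUNT(*) AS connections
FROM flows_orc
GROUP BY ip_src, ip_dst;
```

The last query is the kind of result set Zeppelin can chart directly, or that
you could feed into one of the graph tools you listed.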
Best,
Stefan
From: Pieter Baele <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, 2. January 2019 at 07:50
To: "[email protected]" <[email protected]>
Subject: Graphs based on Metron or PCAP data
Hi,
(and a happy New Year to all as well!)
What would you consider the easiest approach to create a graph based primarily
on ip_dst and ip_src addresses and the number of connections between them?
I was thinking:
- graph functionality in the Elastic stack, but limited (e.g. only recent data
in one index?)
- interfacing with Neo4J
- GraphX using Spark?
- using R on data stored in HDFS?
- using Python: plotly? Pandas?
Sincerely
Pieter