Graph support enables a number of interesting use cases, and which technology 
makes sense really depends on what you’re after.

Spark GraphX is a strong contender for analytics such as betweenness and 
community linkage on the data Metron indexes to HDFS. That would tend to be 
batch work, driven through something like Zeppelin. The very latest Zeppelin 
also supports a network visualisation, which gives you a graph-like visual 
option.
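
As a rough illustration of that kind of batch pass, here is a small 
plain-Python stand-in (not GraphX itself) that counts connections per 
source/destination pair and then groups addresses that are linked at all, 
which is a very crude proxy for community linkage. The field names 
ip_src_addr and ip_dst_addr, the helper names and the sample records are 
assumptions for the sake of the example; adjust them to whatever your 
indexed data actually contains.

```python
from collections import Counter, defaultdict


def build_edge_counts(records, src_field="ip_src_addr", dst_field="ip_dst_addr"):
    """Count connections per (source, destination) pair.

    Records missing either field are skipped. The default field names
    are assumptions, not something confirmed in this thread.
    """
    edges = Counter()
    for rec in records:
        src, dst = rec.get(src_field), rec.get(dst_field)
        if src and dst:
            edges[(src, dst)] += 1
    return edges


def connected_components(edges):
    """Group addresses linked by at least one connection (direction ignored)."""
    neighbours = defaultdict(set)
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    seen, components = set(), []
    for start in neighbours:
        if start in seen:
            continue
        # Depth-first walk from an address we have not visited yet.
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        components.append(group)
    return components


# Made-up sample records, purely for illustration.
sample = [
    {"ip_src_addr": "10.0.0.1", "ip_dst_addr": "10.0.0.2"},
    {"ip_src_addr": "10.0.0.1", "ip_dst_addr": "10.0.0.2"},
    {"ip_src_addr": "10.0.0.2", "ip_dst_addr": "10.0.0.3"},
    {"ip_src_addr": "192.168.1.5", "ip_dst_addr": "192.168.1.9"},
    {"ip_src_addr": "10.0.0.4"},  # no destination, so it is ignored
]

edges = build_edge_counts(sample)
components = connected_components(edges)

for (src, dst), count in edges.most_common():
    print(src, "->", dst, count)
print(len(components), "linked groups")
```

The edge counts are the weighted edge list you would hand to a graph 
library or a network visualisation. On real volumes the same logic would 
need to run in Spark rather than in a single Python process.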

For more interactive use, streaming graph updates and alerting on the graph, 
an actual graph database makes more sense. I’ve seen some work done around 
Metron stacks with JanusGraph, which leans on Solr and HBase and so avoids 
adding too much complexity. JanusGraph is not an Apache project, but it should 
be includable. At present I’ve only seen it used in Metron-based distributions 
rather than in Metron core.
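
To give a feel for what "alerting on graph" means in the streaming case, 
here is a minimal in-memory sketch in plain Python. It does not use 
JanusGraph or any graph database API; the FanOutAlerter class, the 
threshold and the sample events are all hypothetical. It keeps the set of 
destinations seen per source and raises an alert the first time a source 
exceeds a fan-out threshold.

```python
from collections import defaultdict


class FanOutAlerter:
    """Hypothetical helper: track distinct destinations per source address."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.destinations = defaultdict(set)
        self.alerted = set()

    def observe(self, src, dst):
        """Record one connection.

        Returns an alert message the first time src has contacted more
        than `threshold` distinct destinations, otherwise None.
        """
        self.destinations[src].add(dst)
        fan_out = len(self.destinations[src])
        if fan_out > self.threshold and src not in self.alerted:
            self.alerted.add(src)
            return f"{src} has contacted {fan_out} distinct destinations"
        return None


# Made-up event stream, purely for illustration.
events = [
    ("10.0.0.1", "10.0.0.2"),
    ("10.0.0.1", "10.0.0.3"),
    ("10.0.0.1", "10.0.0.3"),  # repeat, does not increase fan-out
    ("10.0.0.1", "10.0.0.4"),
    ("10.0.0.1", "10.0.0.5"),
]

alerter = FanOutAlerter(threshold=2)
alerts = []
for src, dst in events:
    message = alerter.observe(src, dst)
    if message is not None:
        alerts.append(message)

print(alerts)
```

A graph database gives you the same kind of question as a query over 
persisted edges, without holding all the state in one process.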

Simon 


> On 2 Jan 2019, at 11:59, Otto Fowler <ottobackwa...@gmail.com> wrote:
> 
> Pieter,
> Can you create a Jira with your use case?  It is important to capture.  We 
> have some outstanding Jiras around graph support.
> 
> 
>> On January 2, 2019 at 04:40:23, Stefan Kupstaitis-Dunkler 
>> (stefan....@gmail.com) wrote:
>> 
>> Hi Pieter,
>> 
>>  
>> 
>> Happy new year!
>> 
>>  
>> 
>> I believe that always depends on a lot of factors, and the same applies to 
>> any kind of visualization problem with large amounts of data:
>> 
>> - How fast do you need the visualisations to be available?
>> - How up-to-date do they need to be?
>> - How complex?
>> - How beautiful or custom-modified?
>> - How familiar are you with these frameworks? (That could be a reason not 
>>   to use a lib if they are otherwise equal in capabilities.)
>>  
>> 
>> It sounds like you want to create a simple histogram across the full history 
>> of stored data. So I’ll throw in another option that is commonly used for 
>> such use cases:
>> 
>> - Zeppelin notebook:
>>   - Access data stored in HDFS via Hive.
>>   - A bit of preparation in Hive is required (and can be scheduled), e.g. 
>>     creating external tables and converting data into a more efficient 
>>     format, such as ORC.
>>  
>> 
>> Best,
>> 
>> Stefan
>> 
>>  
>> 
>> From: Pieter Baele <pieter.ba...@gmail.com>
>> Reply-To: "user@metron.apache.org" <user@metron.apache.org>
>> Date: Wednesday, 2. January 2019 at 07:50
>> To: "user@metron.apache.org" <user@metron.apache.org>
>> Subject: Graphs based on Metron or PCAP data
>> 
>>  
>> 
>> Hi,
>> 
>>  
>> 
>> (and good New Year to all as well!)
>> 
>>  
>> 
>> What would you consider the easiest approach to create a graph based 
>> primarily on ip_dst and ip_src addresses and the number of connections 
>> between them?
>> 
>>  
>> 
>> I was thinking:
>> 
>> - graph functionality in the Elastic stack, but limited (e.g. only recent 
>> data in one index?)
>> 
>> - interfacing with Neo4j
>> 
>> - GraphX using Spark?
>> 
>> - using R on data stored in HDFS?
>> 
>> - using Python: plotly? Pandas?
>> 
>> Sincerely
>> 
>> Pieter
