In my previous work I modeled telecoms networks, including various kinds of event log data (CDR data among them), and built statistics trees on the event logs for later querying. Take a look at the presentation I gave at GraphConnect NY 2013 for some ideas of what we did:
- Slides: http://www.slideshare.net/craigtaverner/modeling-in-telecoms-2013 (especially slides 27-29 & 32-33)
- Video: https://vimeo.com/79390660

While you could just import a CDR log as a long chain of events, it is better to connect the events into category trees, time trees, or some other graph structure at load time (that is, build the trees while importing the data). You can then write Cypher queries that simply ask pattern questions (match the trees) to get the answers you want. Some obvious examples from the above:

- Connect calls from specific phones to 'phone nodes' (do the same for both caller and callee, see slide 32).
- If you have large volumes of data, consider intermediate nodes. For example, if you always ask about specific phones within time ranges, make the intermediate nodes time-phone nodes, e.g. a single node for each phone on each day/date.
- A time tree for time range queries (e.g. the very long call query in your list).

I cannot comment on 'simbox' because I don't know what that means. Watch the video and see if you get ideas on how to model it yourself, otherwise ask again.

On Sat, Oct 29, 2016 at 4:41 PM, Fares <[email protected]> wrote:

> Dear all,
>
> I am trying to use neo4j for anomaly detection in mobile network data
> (CDRs). This means that I am trying to detect abnormal customers behavior.
> The format of the records may change from company to company but the most
> common attributes are:
> • Caller and called Identification Number;
> • Date and time;
> • Type of Service (Voice Call, SMS, etc...) ;
> • Duration;
> • Network access point identifiers;
> • Others;
>
> I am trying to model such data using Neo4j and then use cypher queries to
> detect abnormal customers behaviors
> Have any one seen or worked with a similar example?
>
> examples of the scenarios that I am interested in are
> 1- a call which is very long
> 2- what are the access points which are used by more users compared to the
> other access points?
> 3- Detect Simbox or interconnect Bypass fraud. How to knows whether the
> call is normal call or Simbox?
> 4- a phone number (a) which call another phone number (b) more that (x)
> times every day?
>
> Kind regards
>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
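
To make the load-time idea from the reply concrete, here is a minimal in-memory sketch in Python. It is not Neo4j or Cypher code, and it is not the model from the slides; it only illustrates building phone nodes, per-phone-per-day intermediate nodes, and a year/month/day time tree while importing, so that scenarios 1 and 4 become simple lookups. The sample rows, the field names, and the functions `load`, `long_calls`, and `frequent_pairs` are all assumptions made up for illustration, and "more than x times every day" is simplified here to "more than x times on a given day".

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical CDR rows: (caller, callee, ISO start time, duration in seconds).
CDRS = [
    ("A", "B", "2016-10-29T09:00:00", 60),
    ("A", "B", "2016-10-29T10:30:00", 45),
    ("A", "B", "2016-10-29T18:05:00", 7200),
    ("C", "A", "2016-10-30T11:00:00", 120),
]


def load(cdrs):
    """Build all index structures during import rather than at query time."""
    # 'Phone nodes': every call is attached to both its caller and its callee.
    phones = defaultdict(lambda: {"made": [], "received": []})
    # Intermediate 'time-phone nodes': one bucket per (caller, calendar date).
    phone_day = defaultdict(list)
    # Time tree: year -> month -> day -> calls that started on that day.
    time_tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for caller, callee, start, duration in cdrs:
        ts = datetime.fromisoformat(start)
        call = {"caller": caller, "callee": callee, "start": ts, "duration": duration}
        phones[caller]["made"].append(call)
        phones[callee]["received"].append(call)
        phone_day[(caller, ts.date())].append(call)
        time_tree[ts.year][ts.month][ts.day].append(call)
    return phones, phone_day, time_tree


def long_calls(time_tree, year, month, day, min_seconds):
    """Calls that started on the given day and lasted at least min_seconds."""
    calls = time_tree.get(year, {}).get(month, {}).get(day, [])
    return [c for c in calls if c["duration"] >= min_seconds]


def frequent_pairs(phone_day, min_calls):
    """(caller, callee, date, count) where caller rang callee more than min_calls times that day."""
    hits = []
    for (caller, day), calls in phone_day.items():
        counts = Counter(c["callee"] for c in calls)
        hits.extend((caller, callee, day, n) for callee, n in counts.items() if n > min_calls)
    return hits


phones, phone_day, time_tree = load(CDRS)
print(long_calls(time_tree, 2016, 10, 29, 3600))
print(frequent_pairs(phone_day, 2))
```

In a graph database the dictionaries would become nodes and relationships created during import, and the two functions would become pattern matches over them; the point of the sketch is only that each question touches one small bucket instead of scanning the whole event chain.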
