Hi All,

We are going to implement a client-IP-based geolocation graph in API Manager
Analytics. After going through the approaches discussed in [1], we selected [2]
as the most suitable one.


*Overview of MaxMind's DB*

As the structure of the DB shows (attached image), there are two tables which
work together to give the location.

We find the geoname_id for the matching network, then get the country and city
from the locations table.
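To make the two-step lookup concrete, here is a minimal sketch in Python. The rows, the names BLOCKS and LOCATIONS, and the function locate are made up for illustration and are not taken from the actual dump; only the shape (network to geoname_id, then geoname_id to country and city) follows the description above.

```python
import ipaddress

# Hypothetical sample rows; the real dump has many more rows and columns.
# First table: network (CIDR) -> geoname_id
BLOCKS = [
    ("203.0.113.0/24", 1001),
    ("198.51.100.0/24", 1002),
]
# Second table (locations): geoname_id -> (country, city)
LOCATIONS = {
    1001: ("Sri Lanka", "Colombo"),
    1002: ("United States", "Mountain View"),
}


def locate(ip):
    """Return (country, city) for ip, or None if no network contains it."""
    addr = ipaddress.ip_address(ip)
    # Step 1: find the network that contains the address.
    for network, geoname_id in BLOCKS:
        if addr in ipaddress.ip_network(network):
            # Step 2: resolve the geoname_id in the locations table.
            return LOCATIONS.get(geoname_id)
    return None
```

The linear scan here is only meant to show the logic; it is not a suggestion for how the lookup should be executed against the full data set.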

*Limitations*

With their database dump, we cannot look up an IP directly in those tables.
We need to check whether the given IP falls between the minimum and maximum IP
of a network. This query takes a long time (10 seconds on indexed data). If we
run it directly from the Spark script, every IP in the summary table will be
queried against these tables, even when the same IP appears in more than one
row. This will therefore have a performance impact on the graph.
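The range check described above can be sketched as follows, assuming each network is stored in CIDR form and converted to a pair of integer bounds. The helper names network_bounds and ip_in_range are hypothetical, and the example network is a documentation range, not a row from the dump.

```python
import ipaddress


def network_bounds(cidr):
    """Return the (min, max) addresses of a CIDR network as integers."""
    net = ipaddress.ip_network(cidr)
    return int(net.network_address), int(net.broadcast_address)


def ip_in_range(ip, low, high):
    """True if ip lies between the network's minimum and maximum IP."""
    return low <= int(ipaddress.ip_address(ip)) <= high


# Example with a made-up network.
low, high = network_bounds("203.0.113.0/24")
inside = ip_in_range("203.0.113.200", low, high)
outside = ip_in_range("203.0.114.1", low, high)
```

Because the match is a range comparison rather than an equality on the IP, it has to be evaluated once per lookup, which is why repeated lookups for the same IP are worth avoiding.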

*Solution*

1. Implement an LRU cache of IP address vs. location.

This needs to be implemented as a custom UDF in Spark. If the IP being queried
from Spark is available in the cache, the location is returned from it. If it
is not, it is retrieved from the DB and put into the cache.

2. Persist in a table

Use the IP as the primary key, with country and city as the other columns, and
retrieve data from that table.
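To compare the two options, here is a rough sketch of both behind the same get(ip) call. It is written in Python only for illustration; the real version would live in a Spark UDF. The class names, the table name ip_location, the capacity value and the lookup callback (standing in for the slow range query) are all assumptions. For option 2, falling back to the slow lookup on a miss and then storing the row is also an assumption about how the table would be filled.

```python
import sqlite3
from collections import OrderedDict


class LruLocationCache:
    """Option 1: in-memory LRU cache of ip -> (country, city)."""

    def __init__(self, lookup, capacity=10000):
        self._lookup = lookup          # slow range query, supplied by caller
        self._capacity = capacity
        self._entries = OrderedDict()  # oldest entry first

    def get(self, ip):
        if ip in self._entries:
            # Cache hit: mark as most recently used.
            self._entries.move_to_end(ip)
            return self._entries[ip]
        # Cache miss: query the DB and remember the answer.
        location = self._lookup(ip)
        self._entries[ip] = location
        if len(self._entries) > self._capacity:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return location


class LocationTable:
    """Option 2: table with ip as primary key, plus country and city."""

    def __init__(self, lookup, path=":memory:"):
        self._lookup = lookup
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS ip_location ("
            "ip TEXT PRIMARY KEY, country TEXT, city TEXT)"
        )

    def get(self, ip):
        row = self._conn.execute(
            "SELECT country, city FROM ip_location WHERE ip = ?", (ip,)
        ).fetchone()
        if row is not None:
            return row
        # Not persisted yet: resolve once and store it for later runs.
        country, city = self._lookup(ip)
        self._conn.execute(
            "INSERT INTO ip_location (ip, country, city) VALUES (?, ?, ?)",
            (ip, country, city),
        )
        self._conn.commit()
        return (country, city)
```

In this sketch the cache is bounded in size but lost on restart, while the table survives between script runs but grows with the number of distinct IPs seen.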


Please let us know which of these would be the most suitable way of doing this.

[1] - Implementing Geographical based Analytics in API Manager mail thread.

[2] - http://dev.maxmind.com/geoip/geoip2/geolite2/


*Thanks*

*Tharindu Dharmarathna*
Associate Software Engineer
WSO2 Inc.; http://wso2.com
lean.enterprise.middleware

mobile: *+94779109091*
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
