Hi All,

We are going to implement a client-IP-based geolocation graph in API Manager Analytics. After going through the approaches discussed in [1], we selected [2] as the most suitable one.
*Overview of MaxMind's DB*

As shown in the DB structure (attached image), they have two tables that work together to resolve a location: find the geoname_id for the matching network, then get the country and city from the locations table.

*Limitations*

With their database dump, we cannot look up an IP directly in those tables. We need to check whether the given IP falls between the network's minimum and maximum IP. This query takes a long time (about 10 seconds on indexed data). If we run it directly from the Spark script, every IP in the summary table triggers a query against these tables, even when the same IP appears in more than one row. This will hurt the performance of the graph.

*Solution*

1. Implement an LRU cache that maps IP address to location. This needs to be implemented in a custom UDF in Spark. If the IP queried from Spark is available in the cache, the location is returned from it; if it is not, the location is retrieved from the DB and put into the cache.
2. Persist a table with the IP as the primary key and the country and city as the other columns, and retrieve data from that table.

Please let us know which of these is the most suitable way of doing this.

[1] - Implementing Geographical based Analytics in API Manager mail thread.
[2] - http://dev.maxmind.com/geoip/geoip2/geolite2/

*Thanks*
*Tharindu Dharmarathna*
Associate Software Engineer
WSO2 Inc.; http://wso2.com
lean.enterprise.middleware
mobile: *+94779109091*
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
