*Info - *
-- We have developed a custom UDF named *getBrowserDetails* which returns a
JSON string.

-- We use this UDF to extract *User-Agent* info such as browser name, browser
version, OS name, OS version, etc.

-- This UDF uses the *WURFL* Java API, which can be found at
http://www.scientiamobile.com/downloads . While developing the Drill UDF we
used the WURFL XML file (size ~*33 MB*).

-- We are using Drill 1.6.0, HDFS, and a cluster of 3 AWS instances of type
m3.2xlarge.

-- The dataset we are operating on is ~60 million rows, ~20 GB GZ-compressed
(JSON data).



*Problem - *
This UDF works fine on a SMALL dataset. But when the dataset is *LARGE*,
it takes too much time. In our case, we haven't gotten any output yet.



*Questions - *
1. Please suggest the correct way to use external files (XML, CSV,
properties, etc.) or any other libraries in a UDF.
2. Has anyone developed or worked with similar functions in Drill?
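
To make question 1 concrete, here is a minimal sketch of the kind of pattern
we are asking about: loading an external file once per JVM instead of once
per row. It is plain Java only; it does not use the Drill or WURFL APIs, and
the class name, resource path, and counter are all hypothetical. Whether
Drill's UDF loading allows a helper class like this is part of the question.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: loads an external resource once per JVM, not once per row. */
public final class ResourceHolder {
    // Counts how often the expensive load actually ran (illustration only).
    public static final AtomicInteger LOAD_COUNT = new AtomicInteger();

    private ResourceHolder() {}

    // Initialization-on-demand holder idiom: the JVM runs this static
    // initializer exactly once, the first time Lazy.DATA is accessed.
    private static final class Lazy {
        // "/device-data.xml" is an assumed classpath location, not a real WURFL path.
        static final String DATA = load("/device-data.xml");
    }

    private static String load(String resourcePath) {
        LOAD_COUNT.incrementAndGet();
        try (InputStream in = ResourceHolder.class.getResourceAsStream(resourcePath)) {
            if (in == null) {
                // Resource not on the classpath; real code should fail loudly here.
                return "";
            }
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new IllegalStateException("could not read " + resourcePath, e);
        }
    }

    /** Intended to be called per row; reuses the data loaded on first access. */
    public static String get() {
        return Lazy.DATA;
    }
}
```

With this shape, every call to ResourceHolder.get() after the first returns
the cached data, so the 33 MB file would be parsed once per node rather than
once per record.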

regards,
shankar
