Hi!

I am trying to understand how to access data stored in a dataset, say the 
dataset "UserQueries", from within a UDF. The intent of the UDF is similar 
to the "WordInList" UDF created here: 
https://github.com/idleft/asterix-udf-template/blob/master/src/main/java/org/apache/asterix/external/library/WordInListFunction.java

The possible pipeline of the system would look like this:
A socket feed is created and started, which listens to incoming data of the 
type "UserQuery". I’ve created a user interface which will send data to the 
specific socket in ADM format. This data is stored in the dataset 
"UserQueries". Then, I wish to access the data in a given record within 
"UserQueries" to find the keywords to use in the WordInList UDF. This 
function/UDF is then going to be used as a query predicate to filter the 
incoming data.
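
To make the intent concrete, the matching step could look roughly like the 
sketch below. It is plain Java and deliberately leaves out the AsterixDB UDF 
interfaces. The class and method names (KeywordFilter, containsAny) are made 
up for illustration, and the keyword list is assumed to come from a record in 
"UserQueries".

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of AsterixDB: only the plain matching logic.
final class KeywordFilter {
    private KeywordFilter() {}

    // Returns true if any keyword equals a whitespace-separated token of
    // the text, ignoring case. Null inputs are treated as "no match".
    static boolean containsAny(String text, List<String> keywords) {
        if (text == null || keywords == null) {
            return false;
        }
        String[] tokens = text.toLowerCase(Locale.ROOT).split("\\s+");
        for (String token : tokens) {
            for (String keyword : keywords) {
                if (keyword != null
                        && token.equals(keyword.toLowerCase(Locale.ROOT))) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Illustrative keywords; in the real pipeline these would be read
        // from the user's record in "UserQueries".
        List<String> keywords = Arrays.asList("storm", "flood");
        System.out.println(containsAny("Heavy flood warning issued", keywords));
        System.out.println(containsAny("Sunny all day", keywords));
    }
}
```

The open question is how the keywords argument would get populated from 
"UserQueries" when the function runs inside the feed.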

Must the UDF be written in SQL++ in order to achieve this, or is it 
possible to write it in Java? The “Data Ingestion in AsterixDB” article 
says that a SQL++ UDF is a good fit when the pre-processing of a 
record requires the result of a query, and I can’t find any documentation on 
doing this with a Java UDF.

If the UDF must be written in SQL++ in order to accomplish this, I am thinking 
of something like this:

create function GetUserQueryKeywords(userId) {
    (select q.keywords from UserQueries q
       where q.userid = userId
       -- assuming a 10-minute window; "PT10" alone is not a valid duration
       and q.timestamp > current_datetime() - day_time_duration("PT10M"))
};

Could you point me in the right direction on how to use such query 
results as input for a UDF like WordInList, if that is possible?

Thanks in advance.

Best regards,
Sandra
