Hi!
I am trying to understand how to access data stored in a dataset, say the
dataset "UserQueries", from within a UDF. Suppose the purpose of the UDF is
similar to that of the "WordInList" UDF defined here:
https://github.com/idleft/asterix-udf-template/blob/master/src/main/java/org/apache/asterix/external/library/WordInListFunction.java
The intended pipeline would look like this:
A socket feed is created and started, which listens for incoming data of the
type "UserQuery". I’ve created a user interface which sends data to that
socket in ADM format. This data is stored in the dataset "UserQueries". I then
want to read a given record in "UserQueries" to find the keywords to pass to
the WordInList UDF. That UDF would in turn be used as a query predicate to
filter the incoming data.
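To make the first half of that pipeline concrete, here is a rough sketch of the
setup I have in mind. The feed name, the socket address, the field types and
the autogenerated "id" key are all placeholders I made up for illustration;
only the field names userid, keywords and timestamp match my actual data.

```sql
-- Sketch only: field types and the "id" key are assumptions.
create type UserQuery as open {
    id: uuid,               -- hypothetical autogenerated primary key
    userid: string,
    keywords: [string],
    timestamp: datetime
};

create dataset UserQueries(UserQuery) primary key id autogenerated;

-- Socket feed that accepts ADM records of type UserQuery.
-- "UserQueryFeed" and the address/port are placeholder values.
create feed UserQueryFeed with {
    "adapter-name": "socket_adapter",
    "sockets": "127.0.0.1:10001",
    "address-type": "IP",
    "type-name": "UserQuery",
    "format": "adm"
};

connect feed UserQueryFeed to dataset UserQueries;
start feed UserQueryFeed;
```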
Must the UDF be written in SQL++ in order to achieve this, or is it possible
to write it in Java? The “Data Ingestion in AsterixDB” article says that a
SQL++ UDF is a good fit when the pre-processing of a record requires the
result of a query, and I can’t find any documentation that shows how to do
this with a Java UDF.
If the UDF must be written in SQL++ in order to accomplish this, I am thinking
something like this:
create function GetUserQueryKeywords(userId) {
    (select q.keywords from UserQueries q
     where q.userid = userId
       -- only queries from the last 10 minutes (the window is just an example)
       and q.timestamp > current_datetime() - day_time_duration("PT10M"))
};
Could you point me in the right direction on how to use such query results as
input to a UDF like WordInList, if that is possible?
Thanks in advance.
Best regards,
Sandra