Hi, thanks for your reply! I will try to be a bit more precise :-)

I am currently testing the decoupled framework, and I would like to use data from another dataset, in this case RankingResult, when enriching tweets. Additionally, I would like to send the incoming tweet, together with a record from RankingResult (say the one with id = 1), to a Java UDF (called from within the SQL++ UDF) for more complex processing, such as clustering the tweets and scoring them based on how relevant they are to a given topic. The scoring within the Java UDF requires information from the record stored in RankingResult.
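To make the Java side a bit more concrete, here is a minimal sketch of the top-k bookkeeping, assuming each tweet has already been given a numeric relevance score. The class and method names (TopKRanking, offer, ranked) are hypothetical and not part of AsterixDB or testlib, and the wiring into the external function interface is deliberately left out. It keeps the k best tweets in a java.util.TreeMap keyed by score and only admits a new tweet if it beats the weakest entry.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not an AsterixDB class): keeps the k highest-scoring tweets. */
final class TopKRanking {
    private final int k;
    // Score -> ids of tweets with that score; TreeMap keeps keys in ascending order.
    private final TreeMap<Double, Deque<String>> byScore = new TreeMap<>();
    private int size = 0;

    TopKRanking(int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        this.k = k;
    }

    /** Offers a scored tweet; returns true if it is now part of the top k. */
    boolean offer(String tweetId, double score) {
        if (size == k && score <= byScore.firstKey()) {
            return false; // not better than the weakest ranked tweet
        }
        byScore.computeIfAbsent(score, s -> new ArrayDeque<>()).addLast(tweetId);
        size++;
        if (size > k) {
            evictLowest();
        }
        return true;
    }

    /** Removes the oldest tweet among those with the lowest score. */
    private void evictLowest() {
        Map.Entry<Double, Deque<String>> lowest = byScore.firstEntry();
        lowest.getValue().removeFirst();
        if (lowest.getValue().isEmpty()) {
            byScore.remove(lowest.getKey());
        }
        size--;
    }

    /** Returns the tweet ids ordered from best to worst score. */
    List<String> ranked() {
        List<String> out = new ArrayList<>();
        for (Deque<String> ids : byScore.descendingMap().values()) {
            out.addAll(ids);
        }
        return out;
    }
}
```

With k = 5 this would mirror the first..fifth fields of the RankingResult record. Ties are kept in a small queue per score so that two tweets with the same score do not overwrite each other, which a plain score-to-tweet map would do.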
Applying the SQL++ UDF to a TwitterFeed, I aim to check whether an incoming tweet scores higher than the tweets found in the RankingResult record (which contains the top-ranked tweets for the given topic). I see now that I could select the record I wish to use with "SELECT VALUE r FROM RankingResult r where id=1". One can think of the RankingResult dataset as holding one record per topic/user query for which I want to find the top k most relevant tweets.

The overall goal of the project is to see whether AsterixDB can be used to continuously rank tweets in real time with respect to a user-defined query, meaning that the RankingResult record for the given user query should be updated continuously. I am also looking into creating a TreeMap data structure in the Java UDF to hold the current top tweets and their scores, and using it to decide whether the incoming tweet should replace any of the top-ranked tweets. However, I would still like to update the RankingResult record in order to make the data queryable.

Thanks in advance,
Sandra

On 2019/04/25 22:10:56, Mike Carey <[email protected]> wrote:
> I will let someone else chime in on what the compilation error might be
> about, but approach 1 has the problem that you rightly tried to correct
> in approach 2 (because SELECT always returns an array of results). But
> - could you say a bit more - up 5000 feet - about the use case you are
> trying to address...? It's not clear (to me) why one might want to have
> a single-item dataset - perhaps that's just a part of your
> trying-to-make-this-work debugging - but it might help if the group
> could see what you are trying to do overall. (E.g., if you just want to
> process incoming records on a feed, you wouldn't need another dataset
> for that. What's the more general picture/desire?)
>
> Cheers,
>
> Mike
>
> On 4/25/19 12:08 AM, [email protected] wrote:
> > Hi devs!
> >
> > Given a datatype RankingResultType and a dataset
> > RankingResult(RankingResultType) which contains only one record, what is
> > the correct approach when I want to pass a single RankingResult record as
> > an argument to a Java UDF in a SQL++ UDF? The resulting record of the Java
> > UDF should be selected at the end of the UDF as it is going to be stored in
> > the dataset the feed which uses the SQL++ UDF is attached to.
> >
> > CREATE FUNCTION rank(newItem) {
> >   LET rankingResult = *must select the record here*,
> >   SELECT testlib#detectRelevance(newItem, *must pass RankingResult record
> >   here*)
> > };
> >
> > I have tried some different approaches, for instance
> > 1. running LET rankingResult = (SELECT VALUE r FROM RankingResult r)
> >    SELECT testlib#detectRelevance(newItem, rankingResult)
> > 2. running LET rankingResult = (SELECT VALUE r FROM RankingResult r)[0]
> >    SELECT testlib#detectRelevance(newItem, rankingResult)
> >
> > The first approach throws a TypeMismatchException, ASX1002: Type mismatch:
> > function testlib#detectRelevance expects its 2nd input parameter to be of
> > type object, but the actual input type is array
> >
> > So I therefore tried to access the first element of the array in the second
> > approach, but the second approach does not compile:
> > ASX1079: Compilation error: The input type union(RankingResultType: closed {
> >   id: bigint,
> >   first: RankingType: open { score: double },
> >   second: RankingType: open {score: double},
> >   third: RankingType: open { score: double},
> >   fourth: RankingType: open {score: double},
> >   fifth: RankingType: open {score: double}
> > } , null, missing) is not a valid record type!
> >
> > Could you maybe point me in the right direction?
> > Thanks in advance!
> >
> > Best,
> > Sandra
> >
>
