I think we should probably restrict the functions that can be applied to a
feed (this needs further thought and discussion), and I know for sure that
we currently don't. As for the case you present, I imagine it could be
allowed in theory, but I think everyone sees why it should be disallowed.

One thing to keep in mind is that we introduce a materialize step when the
target dataset is itself part of the insert pipeline. Now think about how
this would work with a continuous feed. One option would be for the feed to
materialize all records to be inserted and start inserting them only once
the feed stops, but I still think we should not allow it.
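
To make the concern concrete, here is a toy Python model of "materialize
first, insert once the source ends". It is only a sketch and not AsterixDB
code: the names insert_with_materialize, continuous_feed and tweets are
made up for illustration, and a plain dict stands in for the dataset.

```python
from itertools import count, islice


def insert_with_materialize(source, dataset):
    """Buffer every incoming record, then insert only after the source ends."""
    buffered = list(source)  # blocks until the source is exhausted
    for record in buffered:
        dataset[record["id"]] = record
    return len(buffered)


def continuous_feed():
    """Hypothetical endless feed: yields a new record forever."""
    return ({"id": f"feed-{i}"} for i in count())


tweets = {}

# A bounded insert: the source ends, so the buffered records get inserted.
batch = [{"id": f"batch-{i}"} for i in range(3)]
inserted = insert_with_materialize(iter(batch), tweets)

# Passing continuous_feed() directly would never return, because list() would
# wait for an end that never comes. Cutting the feed off after 5 records
# models "once the feed stops": only then do the records reach the dataset.
inserted_after_stop = insert_with_materialize(islice(continuous_feed(), 5), tweets)

print(inserted, inserted_after_stop, len(tweets))  # prints: 3 5 8
```

In this model nothing from the feed is visible in the dataset until the feed
is cut off, which is roughly the behavior I would expect if we allowed it.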

My 2c.
Any opposing arguments?


Amoudi, Abdullah.

On Tue, Dec 8, 2015 at 6:28 PM, Ildar Absalyamov <[email protected]> wrote:

> Hi All,
>
> As part of feed ingestion we allow preprocessing of incoming data with
> AQL UDFs.
> I was wondering whether we restrict in any way the kinds of UDFs that can
> be used. Do we allow joins in these UDFs, especially joins with the same
> dataset that is used for intake? For example:
>
> create type TweetType as open {
>   id: string,
>   username : string,
>   location : string,
>   text : string,
>   timestamp : string
> }
> create dataset Tweets(TweetType)
> primary key id;
> create function feed_processor($x) {
> for $y in dataset Tweets
> // self-join with Tweets dataset on some predicate($x, $y)
> return $y
> }
> create feed TweetFeed
> apply function feed_processor;
>
> The query above fails at runtime, but I was wondering whether it could
> work at all, even in theory.
>
> Best regards,
> Ildar
>
>
