The only problem I see is the Halloween problem in the case of a self-join, hence the need for materialization (I'm not sure whether it is possible in this particular case, but it is definitely possible in general). Other than that, I don't think there is any problem.
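For context, the Halloween problem here is the classic one: an insert pipeline that scans the very dataset it is inserting into can see, and re-process, its own newly inserted records. For a one-shot insert, the engine can avoid this by materializing the scan output before any record is written; a rough AQL sketch of such a statement is below (the filter and the id-rewriting are made up purely for illustration and are not part of any real workload). With a continuous feed there is no point at which that materialized set is ever complete, which is exactly what makes the self-join case awkward.

    // Hypothetical one-shot statement that reads and writes the same dataset.
    // Materializing the scan results before the insert runs is what keeps the
    // newly written "-copy" tweets (which still match the filter) from being
    // scanned and copied again.
    insert into dataset Tweets (
        for $t in dataset Tweets
        where $t.location = "Irvine"   // illustrative filter only
        return {
            "id": string-concat([$t.id, "-copy"]),
            "username": $t.username,
            "location": $t.location,
            "text": $t.text,
            "timestamp": $t.timestamp
        }
    );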
Cheers,
Abdullah

On Dec 8, 2015 11:51 PM, "Mike Carey" <[email protected]> wrote:

> (I am still completely not seeing a problem here.)
>
> On 12/8/15 10:20 PM, abdullah alamoudi wrote:
>
>> The plan is to mostly use Upsert in the future since we can do some
>> optimizations with it that we can't do with an insert.
>> We should also support deletes as well and probably allow a mix of the
>> three operations within the same feed. This is a work in progress right
>> now, but before I go far, I am stabilizing some other parts of the feeds.
>>
>> Cheers,
>> Abdullah.
>>
>> Amoudi, Abdullah.
>>
>> On Tue, Dec 8, 2015 at 10:11 PM, Ildar Absalyamov <[email protected]> wrote:
>>
>>> Abdullah,
>>>
>>> OK, now I see what problems it will cause.
>>> Kinda related question: could the feed implement "upsert" semantics, that
>>> you've been working on, instead of "insert" semantics?
>>>
>>> On Dec 8, 2015, at 21:52, abdullah alamoudi <[email protected]> wrote:
>>>
>>>> I think that we probably should restrict feed-applied functions somehow
>>>> (needs further thought and discussion), and I know for sure that we
>>>> don't.
>>>> As for the case you present, I would imagine that it could be allowed
>>>> theoretically, but I think everyone sees why it should be disallowed.
>>>>
>>>> One thing to keep in mind is that we introduce a materialize if the
>>>> dataset was part of an insert pipeline. Now think about how this would
>>>> work with a continuous feed. One choice would be that the feed will
>>>> materialize all records to be inserted and, once the feed stops, it
>>>> would start inserting them, but I still think we should not allow it.
>>>>
>>>> My 2c,
>>>> Any opposing argument?
>>>>
>>>> Amoudi, Abdullah.
>>>>
>>>> On Tue, Dec 8, 2015 at 6:28 PM, Ildar Absalyamov <[email protected]> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> As a part of feed ingestion we do allow preprocessing incoming data
>>>>> with AQL UDFs.
>>>>> I was wondering if we somehow restrict the kind of UDFs that could be
>>>>> used? Do we allow joins in these UDFs? Especially joins with the same
>>>>> dataset which is used for intake. Ex:
>>>>>
>>>>> create type TweetType as open {
>>>>>     id: string,
>>>>>     username: string,
>>>>>     location: string,
>>>>>     text: string,
>>>>>     timestamp: string
>>>>> }
>>>>> create dataset Tweets(TweetType)
>>>>> primary key id;
>>>>> create function feed_processor($x) {
>>>>>     for $y in dataset Tweets
>>>>>     // self-join with Tweets dataset on some predicate($x, $y)
>>>>>     return $y
>>>>> }
>>>>> create feed TweetFeed
>>>>> apply function feed_processor;
>>>>>
>>>>> The query above fails at runtime, but I was wondering if it could
>>>>> theoretically work at all.
>>>>>
>>>>> Best regards,
>>>>> Ildar
>>>
>>> Best regards,
>>> Ildar
