[
https://issues.apache.org/jira/browse/FLINK-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15041701#comment-15041701
]
ASF GitHub Bot commented on FLINK-2549:
---------------------------------------
Github user StephanEwen commented on the pull request:
https://github.com/apache/flink/pull/1161#issuecomment-162011423
@ChengXiangLi Sorry for keeping you waiting, I have not forgotten about this
pull request.
There are two things in your comment:
1. Exposing Managed Memory to UDFs (this pull request), which is much
more convenient than implementing a deeply integrated operator.
2. Efficiency for APIs like the Table API. The Table API works on managed
memory already, since it sits on top of Flink's join/sort/etc. What you are
hinting at is a lower-level interface where functions get the memory
segments, rather than the row objects, and work directly on the memory
segments. That has been a long-standing wish of mine as well, but it involves
having a separate type of function that supports working with memory segments,
plus a way to handle records that are too large to fit into individual segments.
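To make point (2) more concrete, here is a minimal sketch of what a function
that works on binary data instead of row objects could look like. It is not an
existing Flink interface: SegmentMapFunction, RECORD_SIZE and the fixed
[int key][long value] layout are hypothetical, and java.nio.ByteBuffer only
stands in for a managed memory segment.

```java
import java.nio.ByteBuffer;

public class SegmentFunctionSketch {

    /** Hypothetical interface: the function sees raw bytes, not a deserialized row object. */
    interface SegmentMapFunction {
        void map(ByteBuffer segment, int offset, ByteBuffer out);
    }

    /** Assumed fixed-width record layout: [int key][long value] = 12 bytes. */
    static final int RECORD_SIZE = Integer.BYTES + Long.BYTES;

    /** Reads the fields straight from the buffer and writes the key with a doubled value. */
    static final SegmentMapFunction DOUBLE_VALUE = (segment, offset, out) -> {
        int key = segment.getInt(offset);                      // absolute read, no object created
        long value = segment.getLong(offset + Integer.BYTES);
        out.putInt(key);
        out.putLong(value * 2);
    };

    /** Applies the function to each fixed-width record in the segment. */
    static ByteBuffer run(ByteBuffer segment, int numRecords, SegmentMapFunction fn) {
        ByteBuffer out = ByteBuffer.allocate(numRecords * RECORD_SIZE);
        for (int i = 0; i < numRecords; i++) {
            fn.map(segment, i * RECORD_SIZE, out);
        }
        out.flip(); // switch the output buffer from writing to reading
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer segment = ByteBuffer.allocate(2 * RECORD_SIZE);
        segment.putInt(1).putLong(10L).putInt(2).putLong(21L);

        ByteBuffer result = run(segment, 2, DOUBLE_VALUE);
        System.out.println(result.getInt() + " -> " + result.getLong());
        System.out.println(result.getInt() + " -> " + result.getLong());
    }
}
```

The sketch assumes every record fits entirely inside one buffer. Records that
span several segments, which is the harder part mentioned above, are not
handled here.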
I am still working on point (1). I aimed a bit too high with how I wanted to
abstract that, but I will continue.
I would love to see point (2) at some point. If you are eager to drive
point (2), I'd be very happy. We should probably have a chat and get this
designed, as it involves quite a few things (exposing other abstractions,
spanning records, etc.).
> Add topK operator for DataSet
> -----------------------------
>
> Key: FLINK-2549
> URL: https://issues.apache.org/jira/browse/FLINK-2549
> Project: Flink
> Issue Type: New Feature
> Components: Core, Java API, Scala API
> Reporter: Chengxiang Li
> Assignee: Chengxiang Li
> Priority: Minor
>
> topK is a common operation for users; it would be great to have it in Flink.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)