westonpace commented on PR #35513: URL: https://github.com/apache/arrow/pull/35513#issuecomment-1544181944
ibis-substrait and arrow have to agree on function names (and by name I mean both the URI and the name).

substrait-io/substrait is the main repository for Substrait and defines a standard set of functions that are likely to be interesting to all producers and consumers. I do think first/last belong in that list; however, there may be some discussion to figure out exactly how to classify them (no existing aggregate functions today depend on order, so are these aggregate functions or some kind of special new function?).

So far, to my understanding, functions have fallen into two categories:

 * Well defined functions

   These functions are defined in substrait-io/substrait. They have "official" names that are part of the Substrait spec and should be long-standing. In this case, agreement is easy: official names should always be preferred. To use an official name, Arrow needs a mapping from the official name to the Arrow function name.

 * Arrow-specific functions

   These functions are available in Arrow, but not yet defined in substrait-io/substrait. Since there is no official name, we instead use a special URI `urn:arrow:substrait_simple_extension_function` with the Arrow function name. One current limitation of Arrow-specific functions is that we cannot specify options. However, one benefit is that no mapping is required. The function name in the Substrait plan should always match the Arrow function name exactly.

I'm confused by this PR. It is adding an Arrow mapping, but there is no official function (yet). It will "work" if both ibis-substrait & arrow act like it's already in Substrait, but I feel this is going to be confusing to people. Anyone looking at the Arrow repo, for example, might think that there really is an official first/last function. I think it's best to use the Arrow-specific URI until the official function is adopted.
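To make the two categories concrete, here is a minimal Python sketch of how a consumer might resolve a `(URI, name)` pair to an Arrow function name. It is not Arrow's actual implementation: `resolve_arrow_function`, the `OFFICIAL_TO_ARROW` table, its entries, and the comparison-extension URI are all illustrative assumptions. Only the `urn:arrow:substrait_simple_extension_function` URI is taken from the comment above.

```python
from typing import Dict, Tuple

# URI quoted in the comment for Arrow-specific functions.
ARROW_SIMPLE_URI = "urn:arrow:substrait_simple_extension_function"

# Assumed URI of an official Substrait extension file (illustrative only).
COMPARISON_URI = (
    "https://github.com/substrait-io/substrait/blob/main/"
    "extensions/functions_comparison.yaml"
)

# Hypothetical mapping table: (URI, official name) -> Arrow function name.
# The entries are examples, not a copy of Arrow's real mapping.
OFFICIAL_TO_ARROW: Dict[Tuple[str, str], str] = {
    (COMPARISON_URI, "lt"): "less",
    (COMPARISON_URI, "gt"): "greater",
}


def resolve_arrow_function(uri: str, name: str) -> str:
    """Return the Arrow function name for a Substrait (URI, name) pair."""
    if uri == ARROW_SIMPLE_URI:
        # Arrow-specific function: no mapping needed, the name in the plan
        # must match the Arrow function name exactly.
        return name
    try:
        # Well defined function: an explicit mapping must exist.
        return OFFICIAL_TO_ARROW[(uri, name)]
    except KeyError:
        raise KeyError(f"no Arrow mapping for {name!r} in {uri!r}") from None
```

Under this sketch, a plan that references `first` through the Arrow-specific URI resolves without any mapping entry, while the same name under an official URI fails until a mapping (and, by implication, an official definition) exists.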