sweb commented on pull request #9428: URL: https://github.com/apache/arrow/pull/9428#issuecomment-792942581
> @sweb I can help on Monday. I'm planning to raise the PR for those other regexp functions then can help work through this?

Hey @seddonm1

I have rebased my PR on the current master. My plan would be to only keep `regexp_match` and remove `regexp_extract`. However, since there are review comments that I did not address yet, I did not want to remove `regexp_extract` before I was sure that `regexp_match` is the way to go.

My main issues are:

* What is the correct way to pass the regular expression into the DataFusion function? I have seen in your PR that you treat it as a normal column (ArrayRef) and compile all distinct regular expressions and apply the relevant one. I have to admit that I did not think of this use case, but of course it is much cleaner than my assumption that it is going to be a literal, fingers crossed. However, I am not sure that I would expect this API from a kernel function, i.e. passing the regex as a `StringArray`, instead of a `&str`.
* What do you think concerning the usage of `ListArray` in `regexp_match` and just defining the return types of the corresponding DataFusion functions as `List`? I have no experience in using DataFusion (yet) so I am not sure whether this makes sense or if I have to add some kind of handling for accessing list elements to make this usable.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
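
For readers following the first question above, here is a rough, language-neutral sketch of the "regex as a column" idea: each row carries its own pattern, every distinct pattern is compiled only once, and each row yields a list of capture groups (mirroring the `List` return type raised in the second question). It is written in plain Python with the standard `re` module, not against the Arrow or DataFusion APIs; the function name `regexp_match_rows` and the use of Python lists in place of `StringArray`/`ListArray` are assumptions made purely for illustration.

```python
import re


def regexp_match_rows(values, patterns):
    """Hypothetical helper: apply patterns[i] to values[i], row by row.

    Returns one entry per row: a list of capture groups on a match,
    or None when the value is null, the pattern is null, or nothing matches.
    """
    if len(values) != len(patterns):
        raise ValueError("values and patterns must have the same length")

    compiled = {}  # cache: pattern string -> compiled regex, one per distinct pattern
    out = []
    for value, pattern in zip(values, patterns):
        if value is None or pattern is None:
            out.append(None)  # null in, null out
            continue
        regex = compiled.get(pattern)
        if regex is None:
            regex = re.compile(pattern)  # compiled at most once per distinct pattern
            compiled[pattern] = regex
        match = regex.search(value)
        out.append(list(match.groups()) if match else None)
    return out


values = ["abc-005", "X-7", None, "no digits"]
patterns = [r"([a-z]+)-(\d+)", r"([A-Z])-(\d)", r"\d+", r"(\d+)"]
result = regexp_match_rows(values, patterns)
```

Under this shape, the literal case is just a pattern column that repeats one value, e.g. `regexp_match_rows(values, [r"(\d+)"] * len(values))`, so a single code path can serve both uses at the cost of a dictionary lookup per row.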