Greetings everyone,
I'm experimenting a lot with UDFs that run neural-network inference, mainly for classifying tweets. The problem is that invoking the UDF one record at a time severely under-utilizes a GPU-backed network, and every call also pays the latency of moving data from the CPU to the GPU and back, which results in poor performance.

Ideally the UDF would process records in a micro-batch fashion, letting them accumulate until a certain batch size is reached (as large as my GPU's memory can handle) before passing the data to the neural network to get the outputs. Is there a way to accomplish this with the current UDF framework (in either Java or Python)? If not, where would I have to start in order to develop such a feature?

Best wishes,
Torsten Bergh Moss
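P.S. To make the idea more concrete, here is a rough, framework-agnostic sketch (plain PyTorch, with a placeholder model and batch size, not tied to any particular UDF API) of the buffering behaviour I would like: records accumulate in a buffer, and only a full buffer is transferred to the GPU and run through the network in a single forward pass.

    import torch

    class BatchingClassifier:
        """Accumulates records and runs GPU inference in micro-batches."""

        def __init__(self, model, batch_size=256, device="cuda"):
            self.model = model.to(device).eval()
            self.batch_size = batch_size
            self.device = device
            self.buffer = []   # pending per-record input tensors
            self.results = []  # predicted class ids, in arrival order

        def add(self, record_tensor):
            # Called once per incoming record; inference is deferred
            # until the buffer holds a full micro-batch.
            self.buffer.append(record_tensor)
            if len(self.buffer) >= self.batch_size:
                self.flush()

        def flush(self):
            # One CPU-to-GPU transfer and one forward pass for the whole
            # micro-batch, instead of one per record.
            if not self.buffer:
                return
            batch = torch.stack(self.buffer).to(self.device)
            with torch.no_grad():
                logits = self.model(batch)
            self.results.extend(logits.argmax(dim=1).cpu().tolist())
            self.buffer.clear()

The open question for me is how to hook something like flush() into the UDF lifecycle, so the framework knows when the batched results become available for the individual records.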
