Hi all,

I am writing to promote our open-source feature store project, FeatHub <https://github.com/alibaba/feathub>, which supports using Flink (production-ready) and Spark (not production-ready) to compute real-time and offline features from Pythonic, declarative feature specifications.
To the best of my knowledge, this is the most mature open-source feature store project that supports using Flink as the compute engine. It is also the only such project that supports multiple compute engines (e.g. Flink, Spark) with an engine-agnostic feature definition SDK, so you can choose the compute engine that best meets your needs (e.g. throughput vs. latency) without changing your code. This is a design goal similar to that of Apache Beam.

As another killer feature, we recently added support for application-level metrics: you can define metrics (e.g. the ratio of values that are null in the last 10 minutes) together with your features, and FeatHub will automatically compile, compute, and export these metrics to Prometheus.

Please feel free to learn more about FeatHub by reading its main GitHub README and docs (https://github.com/alibaba/feathub/tree/master/docs/content). We have also provided multiple demos at https://github.com/flink-extended/feathub-examples so that you can easily try out FeatHub using docker-compose.

Cheers,
Dong