Hi folks,

As Apache Arrow grows more popular, we may acquire different kinds of third-party developers:

A. Developers who use and, in many cases, contribute to one of the project's reference implementations

B. Developers who choose to implement the columnar format themselves, without depending on any reference implementation

There's nothing we can do to stop Category B developers, and in some cases building a bespoke implementation may be the correct move.

I'm concerned about the case of incomplete implementations that are advertised as "using Arrow", "following the Arrow specification", or "Arrow-compatible". An implementation is considered incomplete if it does not pass our binary integration test suite (we will eventually need to make this easier to run on third-party libraries: https://issues.apache.org/jira/browse/ARROW-6571).

If an implementation does not have integration tests to prove compliance, then advertisements regarding its level of compatibility or trueness to the specification may mislead users. Problems that arise from these situations may harm the Arrow community's reputation through no fault of our own.

Since we can't force third parties to use any of the Arrow community's code artifacts, one idea is to develop some form of "grading" system that lets projects self-report the nature of their use of the Arrow columnar format, to help answer such questions as:

* Do you use a fully integration-tested implementation? (I am aware of only 4 such libraries at the moment: our reference libraries in C++, Java, JavaScript, and Go. I understand that C# and Rust will get there eventually.)
* If your project "supports Arrow", does that mean just "can serialize data to/from Arrow" or something more?
* Does your project feature some level of "native" Arrow-based processing?

A linear grading scale may not make sense, but having clear answers to some of these questions in downstream projects' documentation would be helpful.

As Apache Arrow's brand grows in value, more and more projects will use the brand in a "Powered By" way, so I think it's important that we help projects clearly communicate to their users to what extent they employ the project.

Thanks,
Wes