Hi All,

I’m opening up a DISCUSS thread to propose a new direction for our DISTILL product, which is presently deprecated and unmaintained. Please feel free to offer different proposals, constructive criticism, and opinions.

About Apache DISTILL: http://flagon.incubator.apache.org/distill/

Context: UserALE.js is, and will likely always be, our ‘flagship’ product. It’s the enabling technology that makes behavioral logging easy to deploy and control, and useful for business and analytical use-cases. Distill, however, was conceived as the analytical framework in which we can really show off what UserALE can do. In Distill, we thought to add the high-value content that would really differentiate us in the market and drive both demand and community growth.

Distill 0.1.0 was envisioned as a stack component that would house analytical libraries for users to call from their own Python environments, or from a front-end visualization client (e.g., TAP). Due to development priorities in the original SensSoft (now Flagon) team, Distill took a backseat to instrumentation work and front-end work. In its current state, it is strongly tied to TAP, which is also deprecated.

Changes in the original SensSoft team make it unlikely that we will be able to revive Distill in its original product vision (although we have very good requirements for Distill v0.2.0: https://cwiki.apache.org/confluence/display/FLAGON/Distill+0.2.0). However, were we to revive Distill as is, we might repeat the same mistakes: rather than focus on the analytical content that will drive adoption and community growth, we would be distracted by work on the infrastructure needed to pull that content into a multi-purpose stack component.

Proposal: My proposal is to refactor Distill to expedite analytical content development. This involves pulling back the product focus from a server-side stack component to a Python package that users can pull into their own Python and Conda environments.
We can then provide some of the basic functions Distill offered (e.g., Elasticsearch queries and aggregations) through third-party dependencies (e.g., elasticsearch-dsl), and quickly begin generating analytical libraries to distribute through the package. Depending on demand, we can revisit the concept of a server-side stack component serving various front-ends, but for visualization we can capitalize on existing streamlined packages like Plotly.

This will require major repository restructuring. However, it lessens the work needed to get to analytical content generation and dissemination, and expedites community growth.

Please share your thoughts. As we reach consensus, we can move to a VOTE to steer toward this proposal or others that arise through discussion.

Thanks,
Josh