I am starting a discussion thread to propose a new aggregate function in AsterixDB.

*Feature:* Aggregate function to infer schema

*Details:* This feature introduces schema inference as an SQL++ function integrated directly into AsterixDB. It is the first approach to offer schema inference as a native SQL++ function, allowing users to infer schemas not only for any dataset but also for queries and subqueries. Its output is in JSON Schema, the industry standard, so the results are both human- and machine-readable and are suitable for user interpretation or for integration into other queries or programs.
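To make the idea concrete, here is a minimal sketch of schema inference written as an aggregate, in plain Java. It is not the AsterixDB implementation and does not use any AsterixDB API: the class name `SchemaSketch` and the `step`/`combine`/`finish` methods are hypothetical, records are assumed to be flat maps, and only the JSON Schema `type` keyword is produced.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical illustration only; not part of AsterixDB.
public class SchemaSketch {
    // Aggregate state: field name -> JSON Schema type names seen so far (sorted).
    private final Map<String, Set<String>> fieldTypes = new TreeMap<>();

    // Map a Java value to a JSON Schema type name.
    static String jsonType(Object v) {
        if (v == null) return "null";
        if (v instanceof Boolean) return "boolean";
        if (v instanceof Integer || v instanceof Long) return "integer";
        if (v instanceof Number) return "number";
        if (v instanceof String) return "string";
        if (v instanceof List) return "array";
        if (v instanceof Map) return "object";
        throw new IllegalArgumentException("unsupported value: " + v.getClass());
    }

    // step: fold one record into the state.
    void step(Map<String, ?> record) {
        for (Map.Entry<String, ?> e : record.entrySet()) {
            fieldTypes.computeIfAbsent(e.getKey(), k -> new TreeSet<>())
                      .add(jsonType(e.getValue()));
        }
    }

    // combine: merge a partial state, e.g. one computed on another partition.
    void combine(SchemaSketch other) {
        other.fieldTypes.forEach((field, types) ->
            fieldTypes.computeIfAbsent(field, k -> new TreeSet<>()).addAll(types));
    }

    // finish: render the state as a JSON Schema string.
    // Field names are assumed not to need JSON escaping in this sketch.
    String finish() {
        String props = fieldTypes.entrySet().stream()
            .map(e -> "\"" + e.getKey() + "\":{\"type\":" + renderTypes(e.getValue()) + "}")
            .collect(Collectors.joining(","));
        return "{\"type\":\"object\",\"properties\":{" + props + "}}";
    }

    // A single type is rendered as a string, several types as an array.
    private static String renderTypes(Set<String> types) {
        if (types.size() == 1) return "\"" + types.iterator().next() + "\"";
        return types.stream().map(t -> "\"" + t + "\"")
                    .collect(Collectors.joining(",", "[", "]"));
    }
}
```

Because the state is a set union per field, partial results can be merged in any order, which is the property an aggregate needs if it is evaluated in several stages.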
Using array_avg() as a template in the Built-in Function and Function Collection files, I implemented array_schema(). During self-review, I noticed that many of the defined aggregate functions, for example SerializableAvgAggregateFunction and IntermediateAvgAggregateFunction, are not used when an array_schema() query runs. Is this because they serve different use cases, or am I using them incorrectly? Are there any resources that explain how aggregate functions work in the implementation?

*APE:* https://cwiki.apache.org/confluence/display/ASTERIXDB/APE+8%3A+Schema+Inference+Aggregate+Functions