Re: Schema Aggregate Function (VOTE)

2025-08-01 Thread Mike Carey
+1 On 8/1/25 1:44 PM, Calvin Dani wrote: Hi, I believe the discussion on this APE converged and was resolved. I also understand that it is at least partially implemented. Therefore it seems the time has come to vote to adopt this APE. The expedited approval process seems less appropriate due to

Re: Schema Aggregate Function (VOTE)

2025-07-31 Thread Ian Maxon
Can we propose this as a vote, like https://lists.apache.org/thread/2k3mk471rflrnwwq64dtjhy8ydblwb92 ? I think this APE has a similar pattern, where it went through some discussion and revision. I think in those cases, calling for a vote is best, rather than using the expedited process. On Mon, Ju

Re: Schema Aggregate Function (VOTE)

2025-07-28 Thread Shiva Jahangiri
If my vote counts, then +1 from me to push this change forward. On Sun, Jul 27, 2025 at 10:46 AM Mike Carey wrote: > +1 > > (in case I didn't already chime in with that) > > On 7/25/25 9:01 AM, Calvin Dani wrote: > > Hi, > > > > I’ve made the changes to the user model and added an example datase

Re: Schema Aggregate Function (VOTE)

2025-07-27 Thread Mike Carey
+1 (in case I didn't already chime in with that) On 7/25/25 9:01 AM, Calvin Dani wrote: Hi, I’ve made the changes to the user model and added an example dataset and query for the schema inference functions. Open to any further feedback and I’d really appreciate your vote if you think it looks

Re: Schema Aggregate Function

2025-07-25 Thread Calvin Dani
Hi, I’ve made the changes to the user model and added an example dataset and query for the schema inference functions. Open to any further feedback and I’d really appreciate your vote if you think it looks good! Thank you and Regards Calvin Dani On Thu, Feb 6, 2025 at 10:40 AM Mike Carey wrote

Re: Schema Aggregate Function

2025-02-06 Thread Mike Carey
I have put comments on the wiki - some thoughts about the user model, etc. On 2/4/25 7:46 AM, Calvin Dani wrote: Hi, The APE has been updated following the implementation of Query 3, current_schema(), which fetches and aggregates the most recent schema from the LSM components. The updates incl

Re: Schema Aggregate Function

2025-02-04 Thread Calvin Dani
Hi, The APE has been updated following the implementation of Query 3, current_schema(), which fetches and aggregates the most recent schema from the LSM components. The updates include: (1) the syntax of the new query, and (2) a flowchart illustrating how the query works. I’d love to hear your thoughts and sugg

Re: Schema Aggregate Function

2024-12-18 Thread Mike Carey
Nice! On 12/13/24 4:32 PM, Calvin Dani wrote: Hi, Regarding the performance testing of the first query for schema inference: We benchmarked it against contemporary methods, primarily Spark-based implementations, using a configuration of 2 node controllers and 8 data partitions. For a GitH

Re: Schema Aggregate Function

2024-12-13 Thread Calvin Dani
Hi, Regarding the performance testing of the first query for schema inference: We benchmarked it against contemporary methods, primarily Spark-based implementations, using a configuration of 2 node controllers and 8 data partitions. For a GitHub dataset of 51GB: Our approach inferred the schema

Re: Schema Aggregate Function

2024-12-12 Thread Shiva Jahangiri
Hi Professor, Not sure if the question was for Calvin and me or for the dev team. Calvin only evaluated his implementation vs. other implementations compatible with Spark. I asked him to run the same evaluation for the current schema inference of Couchbase and update us with the results. Best, Shiva

Re: Schema Aggregate Function

2024-12-11 Thread Mike Carey
Question - I think you were doing some perf testing - do you have perf results for these (vs. the current schema function)? On 12/5/24 12:04 PM, Calvin Dani wrote: Hi, Wanted to share an update regarding the features in the APE. The two queries: 1. query_schema() 2. collection_schema() are

Re: Schema Aggregate Function

2024-12-05 Thread Calvin Dani
Hi, Wanted to share an update regarding the features in the APE. The two queries: 1. query_schema() 2. collection_schema() are now functional. The query_schema() implementation has been submitted for review. Once that is approved, I will proceed to submit the collection_schema() query, as it de

Re: Schema Aggregate Function

2024-11-06 Thread Calvin Dani
Hi, The APE has been updated with those changes! Regards Calvin Dani On Fri, Nov 1, 2024 at 10:36 AM Mike Carey wrote: > Excellent! +1 > > On Fri, Nov 1, 2024 at 9:35 AM Calvin Dani > wrote: > > > Hi, > > > > Thank you for the feedback and as per last meeting here are the changes > > that ar

Re: Schema Aggregate Function

2024-11-01 Thread Mike Carey
Excellent! +1 On Fri, Nov 1, 2024 at 9:35 AM Calvin Dani wrote: > Hi, > > Thank you for the feedback and as per last meeting here are the changes > that are incorporated to this APE. > They are as follows: > 1. Name of the schema inference functions > 2. Schema inference functionality > > The

Re: Schema Aggregate Function

2024-11-01 Thread Calvin Dani
Hi, Thank you for the feedback. As per the last meeting, here are the changes incorporated into this APE. They are as follows: 1. Name of the schema inference functions 2. Schema inference functionality The summary of changes is as follows: 1. query_schema (Aggregate function that tak

Re: Schema Aggregate Function

2024-10-04 Thread Mike Carey
Great feature! I wasn't able to understand the query example(s), though... Could those be cleaned up a little and clarified? Also, I think we might want two functions at the user level - one that takes an expression as input and reports its schema, and another that takes a dataset/collection