Re: [YAML] Aggregations

2023-10-30 Thread Kenneth Knowles
Automatically dereferencing, basically. It is nice. Especially for many-to-many relationships like the example. I don't know if the aggregation is any different though, is it? Kenn On Sun, Oct 29, 2023 at 1:12 PM Robert Burke wrote: > I came across Edge DB, and it has a novel syntax moving

Re: [YAML] Aggregations

2023-10-29 Thread Robert Burke
I came across Edge DB, and it has a novel syntax moving away from SQL with their EdgeQL. https://www.edgedb.com/ E.g. here are two equivalent "nested" queries.

# EdgeQL
select Movie {
  title,
  actors: { name },
  rating := math::mean(.reviews.score)
} filter "Zendaya" in .actors.name;

Re: [YAML] Aggregations

2023-10-23 Thread XQ Hu via dev
+1 on your proposal. On Fri, Oct 20, 2023 at 4:59 PM Robert Bradshaw via dev wrote: > On Fri, Oct 20, 2023 at 11:35 AM Kenneth Knowles wrote: > > > > A couple other bits on having an expression language: > > > > - You already have Python lambdas at places, right? so that's quite a > lot more

Re: [YAML] Aggregations

2023-10-20 Thread Robert Bradshaw via dev
On Fri, Oct 20, 2023 at 11:35 AM Kenneth Knowles wrote: > > A couple other bits on having an expression language: > > - You already have Python lambdas at places, right? so that's quite a lot > more complex than SQL project/aggregate expressions > - It really does save a lot of pain for users

Re: [YAML] Aggregations

2023-10-20 Thread Kenneth Knowles
A couple other bits on having an expression language: - You already have Python lambdas at places, right? so that's quite a lot more complex than SQL project/aggregate expressions - It really does save a lot of pain for users (at the cost of implementation complexity) when you need to

Re: [YAML] Aggregations

2023-10-19 Thread Robert Bradshaw via dev
On Thu, Oct 19, 2023 at 12:53 PM Reuven Lax wrote: > > Is the schema Group transform (in Java) something along these lines? Yes, for sure it is. It (and Python's and Typescript's equivalent) are linked in the original post. The open question is how to best express this in YAML. > On Wed, Oct

Re: [YAML] Aggregations

2023-10-19 Thread Reuven Lax via dev
Is the schema Group transform (in Java) something along these lines? On Wed, Oct 18, 2023 at 1:11 PM Robert Bradshaw via dev wrote: > Beam Yaml has good support for IOs and mappings, but one key missing > feature for even writing a WordCount is the ability to do Aggregations > [1]. While the

Re: [YAML] Aggregations

2023-10-19 Thread Reuven Lax via dev
Or are you specifically referring to the declarative YAML pipelines? On Thu, Oct 19, 2023 at 12:53 PM Reuven Lax wrote: > Is the schema Group transform (in Java) something along these lines? > > On Wed, Oct 18, 2023 at 1:11 PM Robert Bradshaw via dev < > dev@beam.apache.org> wrote: > >> Beam

Re: [YAML] Aggregations

2023-10-19 Thread Robert Bradshaw via dev
On Thu, Oct 19, 2023 at 11:12 AM Kenneth Knowles wrote: > > Using SQL expressions in strings is maybe OK given we are all > relational all the time. Either way you have to define what the > universe of `fn` is. Here's a compact possibility: > > type: Combine > config: > group_by: [field1,

Re: [YAML] Aggregations

2023-10-19 Thread Robert Bradshaw via dev
On Thu, Oct 19, 2023 at 11:42 AM Jan Lukavský wrote: > > On 10/19/23 19:41, Robert Bradshaw via dev wrote: > > On Thu, Oct 19, 2023 at 10:25 AM Jan Lukavský wrote: > >> On 10/19/23 18:28, Robert Bradshaw via dev wrote: > >>> On Thu, Oct 19, 2023 at 9:00 AM Byron Ellis wrote: > Rill is

Re: [YAML] Aggregations

2023-10-19 Thread Jan Lukavský
On 10/19/23 19:41, Robert Bradshaw via dev wrote: On Thu, Oct 19, 2023 at 10:25 AM Jan Lukavský wrote: On 10/19/23 18:28, Robert Bradshaw via dev wrote: On Thu, Oct 19, 2023 at 9:00 AM Byron Ellis wrote: Rill is definitely SQL-oriented but I think that's going to be the most common.

Re: [YAML] Aggregations

2023-10-19 Thread Kenneth Knowles
Using SQL expressions in strings is maybe OK given we are all relational all the time. Either way you have to define what the universe of `fn` is. Here's a compact possibility:

type: Combine
config:
  group_by: [field1, field2]
  aggregates:
    max_cost: "MAX(cost)"
    total_cost: "SUM(cost)"
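
To make the intended semantics of the compact form concrete, here is a minimal plain-Python sketch of what such a config might compute. It is not Beam code and not the proposed implementation: `combine`, `parse_aggregate`, and `AGGREGATE_FNS` are hypothetical names, the "universe of `fn`" is assumed to be just MAX and SUM, and each aggregate string is assumed to have the simple shape `FN(field)`.

```python
import re
from collections import defaultdict

# Assumed universe of aggregate functions (hypothetical; only MAX and SUM here).
AGGREGATE_FNS = {"MAX": max, "SUM": sum}


def parse_aggregate(expr):
    """Split a string like "MAX(cost)" into ("MAX", "cost")."""
    match = re.fullmatch(r"\s*(\w+)\(\s*(\w+)\s*\)\s*", expr)
    if match is None:
        raise ValueError(f"unsupported aggregate expression: {expr!r}")
    fn_name, field = match.group(1).upper(), match.group(2)
    if fn_name not in AGGREGATE_FNS:
        raise ValueError(f"unknown aggregate function: {fn_name}")
    return fn_name, field


def combine(rows, group_by, aggregates):
    """Group dict rows by the group_by fields, then apply each aggregate."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[f] for f in group_by)].append(row)

    results = []
    for key, members in groups.items():
        out = dict(zip(group_by, key))
        for out_name, expr in aggregates.items():
            fn_name, field = parse_aggregate(expr)
            out[out_name] = AGGREGATE_FNS[fn_name]([m[field] for m in members])
        results.append(out)
    return results


rows = [
    {"field1": "a", "field2": 1, "cost": 3.0},
    {"field1": "a", "field2": 1, "cost": 5.0},
    {"field1": "b", "field2": 2, "cost": 7.0},
]
result = combine(
    rows,
    group_by=["field1", "field2"],
    aggregates={"max_cost": "MAX(cost)", "total_cost": "SUM(cost)"},
)
```

Each output row carries the grouping fields plus one field per entry under `aggregates`, which is the schema'd shape the config implies without exposing key/value pairs.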

Re: [YAML] Aggregations

2023-10-19 Thread Robert Bradshaw via dev
On Thu, Oct 19, 2023 at 10:25 AM Jan Lukavský wrote: > > On 10/19/23 18:28, Robert Bradshaw via dev wrote: > > On Thu, Oct 19, 2023 at 9:00 AM Byron Ellis wrote: > >> Rill is definitely SQL-oriented but I think that's going to be the most > >> common. Dataframes are explicitly modeled on the

Re: [YAML] Aggregations

2023-10-19 Thread Jan Lukavský
On 10/19/23 18:28, Robert Bradshaw via dev wrote: On Thu, Oct 19, 2023 at 9:00 AM Byron Ellis wrote: Rill is definitely SQL-oriented but I think that's going to be the most common. Dataframes are explicitly modeled on the relational approach so that's going to look a lot like SQL, I think

Re: [YAML] Aggregations

2023-10-19 Thread Byron Ellis via dev
On Thu, Oct 19, 2023 at 9:28 AM Robert Bradshaw wrote: > On Thu, Oct 19, 2023 at 9:00 AM Byron Ellis wrote: > > > > Rill is definitely SQL-oriented but I think that's going to be the most > common. Dataframes are explicitly modeled on the relational approach so > that's going to look a lot like

Re: [YAML] Aggregations

2023-10-19 Thread Robert Bradshaw via dev
On Thu, Oct 19, 2023 at 9:00 AM Byron Ellis wrote: > > Rill is definitely SQL-oriented but I think that's going to be the most > common. Dataframes are explicitly modeled on the relational approach so > that's going to look a lot like SQL, I think pretty much any approach that fits here is

Re: [YAML] Aggregations

2023-10-19 Thread Byron Ellis via dev
Rill is definitely SQL-oriented but I think that's going to be the most common. Dataframes are explicitly modeled on the relational approach so that's going to look a lot like SQL, which leaves us with S-style formulas (which I like but are pretty niche) and I guess pivot tables coming from the

Re: [YAML] Aggregations

2023-10-18 Thread Robert Burke
MongoDB has its own concept of aggregation pipelines as well. https://www.mongodb.com/docs/manual/core/aggregation-pipeline/#std-label-aggregation-pipeline On Wed, Oct 18, 2023, 6:07 PM Robert Bradshaw via dev wrote: > On Wed, Oct 18, 2023 at 5:06 PM Byron Ellis wrote: > > > > Is it worth

Re: [YAML] Aggregations

2023-10-18 Thread Robert Bradshaw via dev
On Wed, Oct 18, 2023 at 5:06 PM Byron Ellis wrote: > > Is it worth taking a look at similar prior art in the space? +1. Pointers welcome. > The first one that comes to mind is Transform, but with the dbt labs > acquisition that spec is a lot harder to find. Rill is pretty similar though. Rill

Re: [YAML] Aggregations

2023-10-18 Thread Byron Ellis via dev
Is it worth taking a look at similar prior art in the space? The first one that comes to mind is Transform, but with the dbt labs acquisition that spec is a lot harder to find. Rill is pretty similar though. On Wed, Oct 18, 2023 at 1:12 PM Robert

[YAML] Aggregations

2023-10-18 Thread Robert Bradshaw via dev
Beam Yaml has good support for IOs and mappings, but one key missing feature for even writing a WordCount is the ability to do Aggregations [1]. While the traditional Beam primitive is GroupByKey (and CombineValues), we're eschewing KVs in the notion of more schema'd data (which has some