Re: [DISCUSS] Make RexNode serializable
BTW, if we use 'externalize', what is its opposite? Is it `internalize` (which doesn't sound right)? Or should we call it "de-externalize"?

-Rui

On Wed, Jul 8, 2020 at 4:02 PM Rui Wang wrote:
> Oh, got it :-)
>
> -Rui
Re: [DISCUSS] Make RexNode serializable
Oh, got it :-)

-Rui

On Wed, Jul 8, 2020 at 3:55 PM Julian Hyde wrote:
> Please call it 'externalize'. 'Serialize' gives some folks PTSD. :)
Re: [DISCUSS] Make RexNode serializable
Please call it 'externalize'. 'Serialize' gives some folks PTSD. :)

On Wed, Jul 8, 2020 at 2:26 PM Rui Wang wrote:
> Thanks everyone for your input. It now sounds like RexNode serialization
> is not an easy effort (i.e. not as easy as making classes implement
> Serializable). I will log a JIRA to document people's opinions.
>
> Currently I am leaning toward adding serialize()/deserialize() methods to
> RexNode, and also seeing if we can improve RelToJson/JsonToRel to include
> RexNode serialization/deserialization (which gives JSON, thus a String).
>
> -Rui
Re: [DISCUSS] Make RexNode serializable
Thanks everyone for your input. It now sounds like RexNode serialization is not an easy effort (i.e. not as easy as making classes implement Serializable). I will log a JIRA to document people's opinions.

Currently I am leaning toward adding serialize()/deserialize() methods to RexNode, and also seeing if we can improve RelToJson/JsonToRel to include RexNode serialization/deserialization (which gives JSON, thus a String).

-Rui

On Wed, Jul 8, 2020 at 1:42 PM Julian Hyde wrote:
> Serializability is not very popular in Java right now. There are a
> bunch of concerns, including security. Serializable classes are very
> brittle, because it's very easy to add a non-serializable field value
> in a sub-class.
>
> I strongly favor externalizing over serialization. Convert RexNode to
> a serializable type (e.g. java.lang.String!), and then convert it back
> on the other side.
>
> Julian
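The round trip discussed here (turn the expression into a String on one side, rebuild it on the other) can be sketched roughly as follows. This is only an illustration under stated assumptions: `Expr`, `externalize()` and `parse()` are hypothetical names for a toy expression tree, not Calcite's RexNode API, and the parenthesized prefix format is an arbitrary choice rather than the JSON format mentioned in the thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy expression tree; a hypothetical stand-in for RexNode, not Calcite's API. */
final class ExternalizeSketch {
  /** A node is either a leaf (no operands) or an operator call. */
  static final class Expr {
    final String name;          // operator name or leaf token, e.g. "+" or "$0"
    final List<Expr> operands;  // empty for leaves

    Expr(String name, List<Expr> operands) {
      this.name = name;
      this.operands = operands;
    }

    /** Externalize: write the tree as a plain String such as "(+ $0 (* $1 2))". */
    String externalize() {
      if (operands.isEmpty()) {
        return name;
      }
      StringBuilder sb = new StringBuilder("(").append(name);
      for (Expr operand : operands) {
        sb.append(' ').append(operand.externalize());
      }
      return sb.append(')').toString();
    }
  }

  /** The reverse direction: rebuild the tree from the externalized String. */
  static Expr parse(String text) {
    String[] tokens =
        text.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
    int[] pos = {0};
    Expr result = parseTokens(tokens, pos);
    if (pos[0] != tokens.length) {
      throw new IllegalArgumentException("trailing input: " + text);
    }
    return result;
  }

  private static Expr parseTokens(String[] tokens, int[] pos) {
    String token = tokens[pos[0]++];
    if (!token.equals("(")) {
      return new Expr(token, new ArrayList<Expr>()); // leaf
    }
    String operator = tokens[pos[0]++];
    List<Expr> operands = new ArrayList<>();
    while (!tokens[pos[0]].equals(")")) {
      operands.add(parseTokens(tokens, pos));
    }
    pos[0]++; // consume ")"
    return new Expr(operator, operands);
  }

  public static void main(String[] args) {
    String wire = "(+ $0 (* $1 2))";
    Expr expr = parse(wire);                // receiving side
    System.out.println(expr.externalize()); // prints the same String again
  }
}
```

Only the String crosses the network in this sketch, so the tree classes themselves never need to implement Serializable.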
Re: [DISCUSS] Make RexNode serializable
Serializability is not very popular in Java right now. There are a bunch of concerns, including security. Serializable classes are very brittle, because it's very easy to add a non-serializable field value in a sub-class.

I strongly favor externalizing over serialization. Convert RexNode to a serializable type (e.g. java.lang.String!), and then convert it back on the other side.

Julian

On Tue, Jul 7, 2020 at 7:25 PM Danny Chan wrote:
> Serializing the RexNode as JSON is a solution, but I'm afraid it cannot
> solve the problem completely.
> One problem is how to re-parse the JSON back to a RexNode: the current
> RelJsonReader can only re-parse a RelNode, not a RexNode, and it needs
> the RelOptSchema to look up the operators.
>
> In the distributed scenarios of Beam, I'm afraid it is hard to get the
> RelOptSchema, because that is execution time; we usually only see the
> RelOptSchema during SQL compile time.
>
> Best,
> Danny Chan
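The brittleness mentioned above can be shown with plain JDK classes. In this minimal sketch, `Node`, `SubNode` and `Helper` are hypothetical names invented for the illustration; the point is only that the subclass compiles without complaint and fails when it is actually written to an ObjectOutputStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class BrittleSerializableSketch {
  /** Hypothetical base class that opts in to Java serialization. */
  static class Node implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;

    Node(String name) {
      this.name = name;
    }
  }

  /** Any helper type that does not implement Serializable. */
  static class Helper {}

  /** Compiles fine, but its extra field breaks serialization at run time. */
  static class SubNode extends Node {
    private static final long serialVersionUID = 1L;
    final Helper helper = new Helper();

    SubNode(String name) {
      super(name);
    }
  }

  /** Returns true if Java serialization accepts the object, false otherwise. */
  static boolean canSerialize(Object o) {
    try (ObjectOutputStream out =
        new ObjectOutputStream(new ByteArrayOutputStream())) {
      out.writeObject(o);
      return true;
    } catch (NotSerializableException e) {
      return false; // some field in the object graph is not Serializable
    } catch (IOException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(canSerialize(new Node("a")));    // prints "true"
    System.out.println(canSerialize(new SubNode("b"))); // prints "false"
  }
}
```

Nothing in the compiler warns about `SubNode`; the failure only shows up when an instance reaches the wire, which is why a large class hierarchy is hard to keep serializable over time.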
Re: [DISCUSS] Make RexNode serializable
Serializing the RexNode as JSON is a solution, but I'm afraid it cannot solve the problem completely.

One problem is how to re-parse the JSON back to a RexNode: the current RelJsonReader can only re-parse a RelNode, not a RexNode, and it needs the RelOptSchema to look up the operators.

In the distributed scenarios of Beam, I'm afraid it is hard to get the RelOptSchema, because that is execution time; we usually only see the RelOptSchema during SQL compile time.

Best,
Danny Chan

On July 8, 2020 at 3:39 AM +0800, Roman Kondakov wrote:
> Hi Rui,
>
> AFAIK, RelNodes can be serialized to and deserialized from JSON format.
> See test [1] as an example. If I understand it correctly, RelNodes are
> serialized along with enclosed RexNodes, so you can transfer them over
> the network as plain strings.
>
> [1]
> https://github.com/apache/calcite/blob/f64cdcbb9f6535650f0227da19640e736496a9c3/core/src/test/java/org/apache/calcite/plan/RelWriterTest.java#L88
>
> --
> Roman Kondakov
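The dependency described above, that the receiving side needs some lookup context before it can turn names back into operators, can be illustrated with a deliberately tiny model. Everything here is an assumption made for the sketch: the `"PLUS 2 3"` wire form and the `standardOperators()` table are hypothetical and merely stand in for the role the RelOptSchema plays in the message; none of it is Calcite's API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntBinaryOperator;

final class OperatorLookupSketch {
  /** Hypothetical worker-side operator table, standing in for the schema lookup. */
  static Map<String, IntBinaryOperator> standardOperators() {
    Map<String, IntBinaryOperator> table = new HashMap<>();
    table.put("PLUS", (a, b) -> a + b);
    table.put("TIMES", (a, b) -> a * b);
    return table;
  }

  /** Resolves a wire form such as "PLUS 2 3"; fails if the operator is unknown. */
  static int evaluate(String wire, Map<String, IntBinaryOperator> table) {
    String[] parts = wire.trim().split("\\s+");
    IntBinaryOperator op = table.get(parts[0]);
    if (op == null) {
      // The String arrived intact, but without the table it cannot be resolved.
      throw new IllegalArgumentException("unknown operator: " + parts[0]);
    }
    return op.applyAsInt(Integer.parseInt(parts[1]), Integer.parseInt(parts[2]));
  }

  public static void main(String[] args) {
    System.out.println(evaluate("PLUS 2 3", standardOperators())); // prints "5"
    try {
      evaluate("PLUS 2 3", new HashMap<String, IntBinaryOperator>());
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage()); // prints "unknown operator: PLUS"
    }
  }
}
```

The second call fails even though the text itself was transferred correctly, which mirrors the concern that a worker without access to the compile-time schema may be unable to rebuild the expression.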
Re: [DISCUSS] Make RexNode serializable
Hi Rui,

AFAIK, RelNodes can be serialized to and deserialized from JSON format. See test [1] as an example. If I understand it correctly, RelNodes are serialized along with enclosed RexNodes, so you can transfer them over the network as plain strings.

[1] https://github.com/apache/calcite/blob/f64cdcbb9f6535650f0227da19640e736496a9c3/core/src/test/java/org/apache/calcite/plan/RelWriterTest.java#L88

--
Roman Kondakov

On 07.07.2020 22:13, Enrico Olivelli wrote:
> Rui
>
> Did you evaluate using a framework like Kryo, which allows you to
> serialize non-serializable classes?
>
> I think that in general Java serialisation is not efficient, as it is
> too general-purpose. It also brings in a few security issues.
>
> Maybe an alternative idea is to add an ad-hoc serialisation mechanism
> to RexNode. We should also ensure that every RexNode can be serialized
> and deserialized.
>
> Enrico
Re: [DISCUSS] Make RexNode serializable
Rui

On Tue, Jul 7, 2020, 20:30 Rui Wang wrote:

> Hi Community,
>
> In Apache Beam we are facing a use case where we need to keep RexNode in
> our distributed primitives. Because of the nature of distributed computing,
> Beam requires those primitives to be serializable (so that they can be
> sent over the network to backends/workers for further execution).
>
> In the Java world this requirement means making RexNode implement the Java
> Serializable interface.
>
> A workaround right now is to create a bunch of classes that "clone" RexNode
> while making those classes implement the Serializable interface.

Did you evaluate using a framework like Kryo, which allows you to serialize non-serializable classes?

I think that in general Java serialisation is not efficient, as it is too general-purpose. It also brings in a few security issues.

Maybe an alternative idea is to add an ad-hoc serialisation mechanism to RexNode. We should also ensure that every RexNode can be serialized and deserialized.

Enrico

> So what do you think of the idea of making RexNode implement the
> Serializable interface?
>
> -Rui