The main idea of Wayang is to provide a layer that picks the best combination of platforms to process a query; you can see the details in the RHEEMix paper [1].
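To make that a bit more concrete, here is a toy sketch of choosing a platform by estimated cost. It is only an illustration: it is not Wayang's actual optimizer or API, and every name and cost number in it (Platform, plan_cost, pick_platform, MOVE_COST_PER_GB) is a made-up assumption.

```python
from dataclasses import dataclass


# Hypothetical per-platform costs. The numbers are invented for illustration
# and are not taken from Wayang or from the paper.
@dataclass(frozen=True)
class Platform:
    name: str
    startup_cost: float  # fixed overhead of launching a job on the platform
    cost_per_gb: float   # processing cost per GB of input


PLATFORMS = {
    "postgres": Platform("postgres", startup_cost=1.0, cost_per_gb=4.0),
    "flink": Platform("flink", startup_cost=20.0, cost_per_gb=0.5),
}

# Assumed cost of moving 1 GB from one platform to another.
MOVE_COST_PER_GB = 2.0


def plan_cost(platform_name: str, data_gb: float, data_location: str) -> float:
    """Estimate the cost of processing data_gb on a platform, movement included."""
    platform = PLATFORMS[platform_name]
    # Data only has to be moved if it does not already live on that platform.
    move = 0.0 if data_location == platform_name else MOVE_COST_PER_GB * data_gb
    return platform.startup_cost + platform.cost_per_gb * data_gb + move


def pick_platform(data_gb: float, data_location: str) -> str:
    """Return the name of the platform with the lowest estimated cost."""
    return min(PLATFORMS, key=lambda name: plan_cost(name, data_gb, data_location))
```

With these made-up numbers, pick_platform(0.001, "postgres") stays on Postgres, while pick_platform(1000, "postgres") moves the work to Flink, because the cheaper per-GB processing outweighs the cost of moving the data.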
Providing a SQL API will then let us translate a query into Wayang operators, which allows optimization across platforms that only speak SQL, like Postgres, together with platforms that have no SQL language, like Giraph. The idea of using Calcite comes from the intermediate representation that Calcite generates: it will allow us to create the Wayang plan with "UDFs" that can be translated back to SQL or into executable code that can be run by Flink, for example.

Imagine a query that says something like: Select A.a, A.b, A.c from A join X on A.a = X.a …. X (10TB) is on HDFS and A (100MB) is on Postgres. The plan to execute will then be something like: Select A.a from A (1MB); this result is small, so you can do a broadcast and filter using Flink. If the join result is then just 2 records, Wayang will perform the query on Postgres using those 2 records as the condition. But it could also happen that the join result is 1TB; in that case, the Postgres data will be moved to HDFS and all the rest of the processing will be done using Flink.

Currently the optimizer decides which platform to use depending on the amount of data to process and the data movement. The SQL API will give us more freedom in those decisions, because we will have the whole intermediate representation available to make changes. Once we have the SQL API, we will add platforms that support only SQL ;), as you said.

The idea of using the intermediate representation may sound odd to you, but we can have a meeting to explain it better, so you can understand the full concept and also give us your feedback. Let me know if you are available and when, and I will free up my schedule for it ;). I'm in Germany, just so you can figure out whether we have a time zone difference ;).

Best regards,
Bertty

[1] https://wayang.apache.org/assets/pdf/paper/journal_vldb.pdf

On Sun 2. Jan 2022 at 17:43, kamalesh palanisamy <[email protected]> wrote:

> Hi Bertty,
> Thank you for the information! I would love to work on adding the SQL API
> for Wayang. Basically, now I need to add a new platform for the
> wayang-platforms that supports SQL through apache calcite? Am I right?
> Please do correct me if I am wrong.
>
> Thanks,
> Kamalesh P
>
>
> On Sun, Jan 2, 2022 at 3:36 AM Bertty Contreras <[email protected]>
> wrote:
>
>> Hi Kamalesh,
>>
>> Currently, Apache Wayang(Incubating) has the issues listed in Jira [1].
>> One feature that the community didn't have time to work on is the SQL API
>> for Apache Wayang(Incubating) [2]; the main idea is to use Apache Calcite
>> [3] as the parser of the SQL and then do something like Spark adapter of
>> calcite [4]. If you want to contribute to this feature, it will be so
>> awesome :D.
>>
>> If you found another issue interesting, let me know, or even if you have
>> some idea of a feature will be so awesome too :D
>>
>> Best regards,
>> Bertty
>>
>> [1] https://issues.apache.org/jira/projects/WAYANG
>> [2]
>> https://issues.apache.org/jira/projects/WAYANG/issues/WAYANG-25?filter=allopenissues
>> [3] https://calcite.apache.org
>> [4] https://github.com/apache/calcite/tree/master/spark
>>
>> On Sun, Jan 2, 2022 at 6:50 AM kamalesh palanisamy <[email protected]>
>> wrote:
>>
>>> Hi,
>>> My name is Kamalesh and I am currently looking to contribute to the
>>> project, but I couldn't find any proper issues. Can you help me with any
>>> features you would like me to contribute to?. Thanks!
>>> Thanks,
>>> Kamalesh P
>>>
>>
