[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-10-06 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r989392831


##
connector/connect/src/main/protobuf/spark/connect/expressions.proto:
##
@@ -159,4 +159,17 @@ message Expression {
   // UnresolvedStar is used to expand all the fields of a relation or struct.
   message UnresolvedStar {
   }
+
+  // A resolved attribute that can specify a reference (e.g. column) without needing a resolution
+  // by the analyzer.
+  message Attribute {

Review Comment:
   Renamed.






[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-10-06 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r989392574


##
connector/connect/src/main/protobuf/spark/connect/expressions.proto:
##
@@ -159,4 +159,17 @@ message Expression {
   // UnresolvedStar is used to expand all the fields of a relation or struct.
   message UnresolvedStar {
   }
+
+  // A resolved attribute that can specify a reference (e.g. column) without needing a resolution
+  // by the analyzer.
+  message Attribute {
+    string name = 1;
+    // TODO: Full definition of an attribute is at https://github.com/apache/spark/blob/4f4a080a78c3979961e8483aace663bdc45f489d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala#L105
+    // bool Nullability = 2;
+    // repeated string Qualifier = 3;
+    // int64 exprId = 4;
+    // string UUID = 6;

Review Comment:
   removed



##
connector/connect/src/main/protobuf/spark/connect/expressions.proto:
##
@@ -159,4 +159,17 @@ message Expression {
   // UnresolvedStar is used to expand all the fields of a relation or struct.
   message UnresolvedStar {
   }
+
+  // A resolved attribute that can specify a reference (e.g. column) without needing a resolution

Review Comment:
   done






[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-10-06 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r988644725


##
connector/connect/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala:
##
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.connect.planner
+
+import org.apache.spark.connect.proto
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.expressions.AttributeReference
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
+import org.apache.spark.sql.types.{DataType, IntegerType}
+
+/**
+ * This suite is based on the Connect DSL and tests that, given the same DataFrame operations,
+ * Connect can construct a proto plan that can be translated back and, after analysis, matches
+ * the plan generated by the Spark DataFrame API.
+ */
+class SparkConnectProtoSuite extends PlanTest with SparkConnectPlanTest {
+
+  lazy val connectTestRelation = createLocalRelationProto(Map("id" -> IntegerType))
+
+  lazy val sparkTestRelation: LocalRelation = LocalRelation(AttributeReference("id", IntegerType)())

Review Comment:
   Yes, this is perfect.
   
   There are many tricks to learn about how to use the catalyst DSL... (one of them is sketched below).
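   A small illustration of that trick, for the record (plain Scala against the stock catalyst DSL, not code from this PR):
   
   ```scala
   import org.apache.spark.sql.catalyst.dsl.expressions._
   import org.apache.spark.sql.catalyst.expressions.AttributeReference
   import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
   import org.apache.spark.sql.types.IntegerType
   
   object LocalRelationDslSketch extends App {
     // The explicit form used in the test above...
     val explicit = LocalRelation(AttributeReference("id", IntegerType)())
     // ...and the catalyst DSL shorthand for the same schema: 'id.int builds a
     // nullable AttributeReference named "id" of IntegerType.
     val viaDsl = LocalRelation('id.int)
   
     println(explicit.schema == viaDsl.schema) // true
   }
   ```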






[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-10-06 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r988643026


##
connect/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala:
##
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.connect.planner
+
+import org.apache.spark.connect.proto
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.expressions.AttributeReference
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
+import org.apache.spark.sql.types.{DataType, IntegerType}
+
+/**
+ * This suite is based on the Connect DSL and tests that, given the same DataFrame operations,
+ * Connect can construct a proto plan that can be translated back and, after analysis, matches
+ * the plan generated by the Spark DataFrame API.
+ */
+class SparkConnectProtoSuite extends PlanTest with SparkConnectPlanTest {
+
+  lazy val connectTestRelation = createLocalRelationProto(Map("id" -> IntegerType))
+
+  lazy val sparkTestRelation: LocalRelation = LocalRelation(AttributeReference("id", IntegerType)())
+
+  test("Basic select") {
+    val connectPlan = {
+      // TODO: Scala only allows one implicit per scope so we keep proto implicit imports in
+      // this scope. Need to find a better way to make two implicits work in the same scope.
+      import org.apache.spark.sql.catalyst.connect.expressions._
+      import org.apache.spark.sql.catalyst.connect.plans._
+      transform(connectTestRelation.select("id".protoAttr))
+    }
+    val sparkPlan = sparkTestRelation.select("id")
+    comparePlans(connectPlan.analyze, sparkPlan.analyze, false)
+  }
+
+  private def createLocalRelationProto(colNameTypeMap: Map[String, DataType]): proto.Relation = {

Review Comment:
   `Seq[AttributeReference]` is actually better, based on your comment below.






[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-10-06 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r988641745


##
connect/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala:
##
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.connect.planner
+
+import org.apache.spark.connect.proto
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.expressions.AttributeReference
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
+import org.apache.spark.sql.types.{DataType, IntegerType}
+
+/**
+ * This suite is based on the Connect DSL and tests that, given the same DataFrame operations,
+ * Connect can construct a proto plan that can be translated back and, after analysis, matches
+ * the plan generated by the Spark DataFrame API.
+ */
+class SparkConnectProtoSuite extends PlanTest with SparkConnectPlanTest {
+
+  lazy val connectTestRelation = createLocalRelationProto(Map("id" -> IntegerType))
+
+  lazy val sparkTestRelation: LocalRelation = LocalRelation(AttributeReference("id", IntegerType)())
+
+  test("Basic select") {
+    val connectPlan = {
+      // TODO: Scala only allows one implicit per scope so we keep proto implicit imports in
+      // this scope. Need to find a better way to make two implicits work in the same scope.
+      import org.apache.spark.sql.catalyst.connect.expressions._
+      import org.apache.spark.sql.catalyst.connect.plans._
+      transform(connectTestRelation.select("id".protoAttr))
+    }
+    val sparkPlan = sparkTestRelation.select("id")
+    comparePlans(connectPlan.analyze, sparkPlan.analyze, false)
+  }
+
+  private def createLocalRelationProto(colNameTypeMap: Map[String, DataType]): proto.Relation = {

Review Comment:
   Good point on the ordering.
   
   I changed it to use `Seq[(String, DataType)]`. I guess I will eventually change it to `Seq[AttributeReference]` once we are ready to test all the fields of an `AttributeReference`. Right now only the name (a String) is really used, because the proto has no data type support yet.
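   For readers of the thread, a tiny illustration of the ordering point (plain Scala, not code from the PR):
   
   ```scala
   import org.apache.spark.sql.types._
   
   object ColumnOrderSketch extends App {
     // A default immutable Map stops guaranteeing insertion order once it grows
     // past a handful of entries; a Seq of pairs always keeps the declared order,
     // which is what matters when comparing schemas and plans.
     val asMap: Map[String, DataType] = Map(
       "a" -> IntegerType, "b" -> StringType, "c" -> LongType,
       "d" -> DoubleType, "e" -> BooleanType)
     val asSeq: Seq[(String, DataType)] = Seq(
       "a" -> IntegerType, "b" -> StringType, "c" -> LongType,
       "d" -> DoubleType, "e" -> BooleanType)
   
     println(asMap.keys.mkString(", "))      // hash-dependent order
     println(asSeq.map(_._1).mkString(", ")) // always: a, b, c, d, e
   }
   ```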



##
connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala:
##
@@ -111,8 +122,12 @@ class SparkConnectPlanner(plan: proto.Relation, session: SparkSession) {
 }
   }
 
-  private def transformUnresolvedExpression(exp: proto.Expression): UnresolvedAttribute = {
-    UnresolvedAttribute(exp.getUnresolvedAttribute.getPartsList.asScala.toSeq)
+  private def transformUnresolvedExpression(exp: proto.Expression): Expression = {
+    if (exp.getUnresolvedAttribute.getPartsCount == 1) {
+      Literal.create(exp.getUnresolvedAttribute.getParts(0), StringType)

Review Comment:
   done.






[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-10-06 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r988621244


##
connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala:
##
@@ -111,8 +122,12 @@ class SparkConnectPlanner(plan: proto.Relation, session: SparkSession) {
 }
   }
 
-  private def transformUnresolvedExpression(exp: proto.Expression): UnresolvedAttribute = {
-    UnresolvedAttribute(exp.getUnresolvedAttribute.getPartsList.asScala.toSeq)
+  private def transformUnresolvedExpression(exp: proto.Expression): Expression = {
+    if (exp.getUnresolvedAttribute.getPartsCount == 1) {
+      Literal.create(exp.getUnresolvedAttribute.getParts(0), StringType)

Review Comment:
   I need to change this to match the catalyst DSL.
   
   The catalyst DSL generates a string literal for `a` in `relation.select("a")` (see the sketch below).
   
   If we consider the catalyst DSL the standard way to generate logical plans, then we should follow it in Connect as well.
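   A small sketch of that behavior, assuming only the stock catalyst DSL (nothing from this PR):
   
   ```scala
   import org.apache.spark.sql.catalyst.dsl.expressions._
   import org.apache.spark.sql.catalyst.dsl.plans._
   import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
   
   object CatalystDslSketch extends App {
     val relation = LocalRelation('a.int)
   
     // "a" goes through the implicit String => Literal conversion, so the
     // projected expression is the string literal "a", not a column reference.
     println(relation.select("a"))
   
     // 'a goes through the Symbol => UnresolvedAttribute conversion, so this
     // one actually selects the column named "a".
     println(relation.select('a))
   }
   ```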






[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-09-28 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r982725060


##
connect/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala:
##
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.connect.planner
+
+import org.scalatest.BeforeAndAfter
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.connect.proto
+import org.apache.spark.sql.catalyst.connect.plans.DslLogicalPlan
+import org.apache.spark.sql.catalyst.connect.plans.DslString
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.test.SharedSparkSession
+
+/**
+ * This suite is based on the Connect DSL and tests that, given the same DataFrame operations,
+ * Connect can construct a proto plan that can be translated back and, after analysis, matches
+ * the plan generated by the Spark DataFrame API.
+ */
+class SparkConnectProtoSuite extends SparkFunSuite
+  with SharedSparkSession with SparkConnectPlanTest with PlanTest with BeforeAndAfter {
+
+  lazy val connectTestRelation =
+    proto.Relation.newBuilder()
+      .setRead(
+        proto.Read.newBuilder()
+          .setNamedTable(proto.Read.NamedTable.newBuilder().addParts("student").build())
+          .build())
+      .build()
+
+  lazy val sparkTestRelation = spark.table("student")
+
+  before {
+    setupTestData()
+  }
+
+  test("Basic select") {
+    val connectPlan = analyze(connectTestRelation.select("id".protoAttr))
+    val sparkPlan = sparkTestRelation.select("id").queryExecution.analyzed

Review Comment:
   I see what you are saying. In that case we need to support `LocalRelation` in the Connect proto and generate the Connect plan from a `LocalRelation` proto (otherwise the Connect plan will differ from the Spark plan, which is a `LocalRelation`).
   
   Is this what you prefer?






[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-09-28 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r982038430


##
connect/src/main/scala/org/apache/spark/sql/catalyst/connect/connect.scala:
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.catalyst
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.connect.proto
+import org.apache.spark.sql.catalyst.analysis.UnresolvedAlias
+import org.apache.spark.sql.catalyst.expressions.{Expression, NamedExpression}
+
+/**
+ * A collection of implicit conversions that create a DSL for constructing connect protos.
+ *
+ * {{{
+ *   scala> import org.apache.spark.sql.connect.plans.DslLogicalPlan
+ *
+ *   // Standard way to construct connect proto
+ *   scala> import org.apache.spark.connect.proto
+ *   scala> :paste
+ *   // Entering paste mode (ctrl-D to finish)
+ *   val connectTestRelation =
+ *proto.Relation.newBuilder()
+ * .setRead(
+ *   proto.Read.newBuilder()
+ *       .setNamedTable(proto.Read.NamedTable.newBuilder().addParts("student").build())
+ *   .build())
+ * .build()
+ *  // Exiting paste mode, now interpreting.
+ *connectTestRelation: org.apache.spark.connect.proto.Relation =
+ * read {
+ *  named_table {
+ *parts: "student"
+ *  }
+ *}
+ *
+ *   // Now we can apply select on the proto relation above
+ *   scala> import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
+ *   scala> connectTestRelation.select(UnresolvedAttribute(Seq("id")))
+ *   res14: org.apache.spark.connect.proto.Relation =
+ *project {
+ *  input {
+ *read {
+ *  named_table {
+ *parts: "student"
+ *  }
+ *}
+ *  }
+ *  expressions {
+ *unresolved_attribute {
+ *  parts: "id"
+ *}
+ *  }
+ *}
+ *
+ * }}}
+ *
+ */
+package object connect {
+
+  object plans { // scalastyle:ignore
+    implicit class DslLogicalPlan(val logicalPlan: proto.Relation) {
+      def select(exprs: Expression*): proto.Relation = {
+        val namedExpressions = exprs.map {
+          case e: NamedExpression => e
+          case e => UnresolvedAlias(e)
+        }
+
+        proto.Relation.newBuilder().setProject(
+          proto.Project.newBuilder()
+            .setInput(logicalPlan)
+            .addExpressions(
+              proto.Expression.newBuilder()
+                .setUnresolvedAttribute(

Review Comment:
   Added the DSL part for expressions.
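   For readers of the thread, a hedged guess at what that expression DSL might look like (the PR's actual implementation is not quoted here; the builder names follow the proto snippets above):
   
   ```scala
   import org.apache.spark.connect.proto
   
   object ExpressionsDslSketch {
     // Turns a column name into a proto Expression carrying an unresolved_attribute,
     // so tests can write "id".protoAttr instead of spelling out the builders.
     implicit class ProtoDslString(val s: String) {
       def protoAttr: proto.Expression =
         proto.Expression.newBuilder()
           .setUnresolvedAttribute(
             proto.Expression.UnresolvedAttribute.newBuilder().addParts(s))
           .build()
     }
   }
   ```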






[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-09-27 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r981927808


##
connect/src/main/scala/org/apache/spark/sql/catalyst/connect/connect.scala:
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.catalyst
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.connect.proto
+import org.apache.spark.sql.catalyst.analysis.UnresolvedAlias
+import org.apache.spark.sql.catalyst.expressions.{Expression, NamedExpression}
+
+/**
+ * A collection of implicit conversions that create a DSL for constructing connect protos.
+ *
+ * {{{
+ *   scala> import org.apache.spark.sql.connect.plans.DslLogicalPlan
+ *
+ *   // Standard way to construct connect proto
+ *   scala> import org.apache.spark.connect.proto
+ *   scala> :paste
+ *   // Entering paste mode (ctrl-D to finish)
+ *   val connectTestRelation =
+ *proto.Relation.newBuilder()
+ * .setRead(
+ *   proto.Read.newBuilder()
+ *       .setNamedTable(proto.Read.NamedTable.newBuilder().addParts("student").build())
+ *   .build())
+ * .build()
+ *  // Exiting paste mode, now interpreting.
+ *connectTestRelation: org.apache.spark.connect.proto.Relation =
+ * read {
+ *  named_table {
+ *parts: "student"
+ *  }
+ *}
+ *
+ *   // Now we can apply select on the proto relation above
+ *   scala> import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
+ *   scala> connectTestRelation.select(UnresolvedAttribute(Seq("id")))
+ *   res14: org.apache.spark.connect.proto.Relation =
+ *project {
+ *  input {
+ *read {
+ *  named_table {
+ *parts: "student"
+ *  }
+ *}
+ *  }
+ *  expressions {
+ *unresolved_attribute {
+ *  parts: "id"
+ *}
+ *  }
+ *}
+ *
+ * }}}
+ *
+ */
+package object connect {
+
+  object plans { // scalastyle:ignore
+    implicit class DslLogicalPlan(val logicalPlan: proto.Relation) {
+      def select(exprs: Expression*): proto.Relation = {
+        val namedExpressions = exprs.map {
+          case e: NamedExpression => e
+          case e => UnresolvedAlias(e)
+        }
+
+        proto.Relation.newBuilder().setProject(
+          proto.Project.newBuilder()
+            .setInput(logicalPlan)
+            .addExpressions(
+              proto.Expression.newBuilder()
+                .setUnresolvedAttribute(

Review Comment:
   I see why you might be confused:
   
   the code here is definitely not complete yet (of course, it is hard to make it complete from the beginning).
   
   I plan to expand this as we start to implement one API after another.
   
   Basically, when we start working on `Projection` support, we will test every type of expression for a `select` on a relation. After that this part should be complete (I hope). That PR might be titled `improve projection coverage`.







[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-09-27 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r981920306


##
connect/src/main/scala/org/apache/spark/sql/catalyst/connect/connect.scala:
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.catalyst
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.connect.proto
+import org.apache.spark.sql.catalyst.analysis.UnresolvedAlias
+import org.apache.spark.sql.catalyst.expressions.{Expression, NamedExpression}
+
+/**
+ * A collection of implicit conversions that create a DSL for constructing connect protos.
+ *
+ * {{{
+ *   scala> import org.apache.spark.sql.connect.plans.DslLogicalPlan
+ *
+ *   // Standard way to construct connect proto
+ *   scala> import org.apache.spark.connect.proto
+ *   scala> :paste
+ *   // Entering paste mode (ctrl-D to finish)
+ *   val connectTestRelation =
+ *proto.Relation.newBuilder()
+ * .setRead(
+ *   proto.Read.newBuilder()
+ *       .setNamedTable(proto.Read.NamedTable.newBuilder().addParts("student").build())
+ *   .build())
+ * .build()
+ *  // Exiting paste mode, now interpreting.
+ *connectTestRelation: org.apache.spark.connect.proto.Relation =
+ * read {
+ *  named_table {
+ *parts: "student"
+ *  }
+ *}
+ *
+ *   // Now we can apply select on the proto relation above
+ *   scala> import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
+ *   scala> connectTestRelation.select(UnresolvedAttribute(Seq("id")))
+ *   res14: org.apache.spark.connect.proto.Relation =
+ *project {
+ *  input {
+ *read {
+ *  named_table {
+ *parts: "student"
+ *  }
+ *}
+ *  }
+ *  expressions {
+ *unresolved_attribute {
+ *  parts: "id"
+ *}
+ *  }
+ *}
+ *
+ * }}}
+ *
+ */
+package object connect {
+
+  object plans { // scalastyle:ignore
+    implicit class DslLogicalPlan(val logicalPlan: proto.Relation) {
+      def select(exprs: Expression*): proto.Relation = {
+        val namedExpressions = exprs.map {
+          case e: NamedExpression => e
+          case e => UnresolvedAlias(e)
+        }
+
+        proto.Relation.newBuilder().setProject(
+          proto.Project.newBuilder()
+            .setInput(logicalPlan)
+            .addExpressions(
+              proto.Expression.newBuilder()
+                .setUnresolvedAttribute(

Review Comment:
   I am not familiar enough with the existing analyzer DSL, but:
   1. This DSL matches the analyzer DSL API, so the inputs will be the same.
   2. Then I think yes, we are going to have more conversions: Function -> UnresolvedFunction, Column -> UnresolvedAttribute, Literal -> proto-defined literal, etc. (see the sketch after this comment).
   
   With this PR merged in, when we start to work on API coverage, more cases (various expressions) will be covered by adding more tests.
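   A hedged sketch of the kind of conversion described in point 2, restricted to the `UnresolvedAttribute` case (the nested message and builder names are assumptions based on the proto snippets quoted in this thread; the other branches are placeholders):
   
   ```scala
   import org.apache.spark.connect.proto
   import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
   import org.apache.spark.sql.catalyst.expressions.Expression
   
   object ExpressionConversionSketch {
     def toProto(e: Expression): proto.Expression = e match {
       case a: UnresolvedAttribute =>
         // e.g. nameParts Seq("t", "id") becomes unresolved_attribute { parts: "t" parts: "id" }
         val attr = proto.Expression.UnresolvedAttribute.newBuilder()
         a.nameParts.foreach(attr.addParts)
         proto.Expression.newBuilder().setUnresolvedAttribute(attr).build()
       case other =>
         // Functions, columns and literals would each need their own proto message.
         throw new UnsupportedOperationException(s"Conversion not sketched for: $other")
     }
   }
   ```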






[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-09-27 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r981922371


##
repl/pom.xml:
##
@@ -58,6 +58,11 @@
     <artifactId>spark-sql_${scala.binary.version}</artifactId>
     <version>${project.version}</version>
   </dependency>
+  <dependency>
+    <groupId>org.apache.spark</groupId>
+    <artifactId>spark-connect_${scala.binary.version}</artifactId>

Review Comment:
   done!







[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-09-27 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r981556235


##
connect/src/main/scala/org/apache/spark/sql/connect/package.scala:
##
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql

Review Comment:
   I addressed the package refactoring comment.  Please let me know if you also 
want to revert the shell change.






[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-09-27 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r981520979


##
connect/src/main/scala/org/apache/spark/sql/connect/package.scala:
##
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.connect.proto
+import org.apache.spark.sql.catalyst.analysis.UnresolvedAlias
+import org.apache.spark.sql.catalyst.expressions.{Expression, NamedExpression}
+
+
+/**
+ * A collection of implicit conversions that create a DSL for constructing connect protos.
+ *
+ * {{{
+ *   scala> import org.apache.spark.sql.connect.plans.DslLogicalPlan
+ *
+ *   // Standard way to construct connect proto
+ *   scala> import org.apache.spark.connect.proto
+ *   scala> :paste
+ *   // Entering paste mode (ctrl-D to finish)
+ *   val connectTestRelation =
+ *proto.Relation.newBuilder()
+ * .setRead(
+ *   proto.Read.newBuilder()
+ *       .setNamedTable(proto.Read.NamedTable.newBuilder().addParts("student").build())
+ *   .build())
+ * .build()
+ *  // Exiting paste mode, now interpreting.
+ *connectTestRelation: org.apache.spark.connect.proto.Relation =
+ * read {
+ *  named_table {
+ *parts: "student"
+ *  }
+ *}
+ *
+ *   // Now we can apply select on the proto relation above
+ *   scala> import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
+ *   scala> connectTestRelation.select(UnresolvedAttribute(Seq("id")))
+ *   res14: org.apache.spark.connect.proto.Relation =
+ *project {
+ *  input {
+ *read {
+ *  named_table {
+ *parts: "student"
+ *  }
+ *}
+ *  }
+ *  expressions {
+ *unresolved_attribute {
+ *  parts: "id"
+ *}
+ *  }
+ *}
+ *
+ * }}}
+ *
+ */
+package object connect {
+
+  object plans { // scalastyle:ignore
+    implicit class DslLogicalPlan(val logicalPlan: proto.Relation) {

Review Comment:
   In the current codebase there are `proto.Relation` and `proto.Command`; `proto.Relation` is used for non-command logical plans.






[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-09-27 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r981519878


##
connect/src/main/scala/org/apache/spark/sql/connect/package.scala:
##
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.connect.proto
+import org.apache.spark.sql.catalyst.analysis.UnresolvedAlias
+import org.apache.spark.sql.catalyst.expressions.{Expression, NamedExpression}
+
+
+/**
+ * A collection of implicit conversions that create a DSL for constructing connect protos.
+ *
+ * {{{
+ *   scala> import org.apache.spark.sql.connect.plans.DslLogicalPlan
+ *
+ *   // Standard way to construct connect proto
+ *   scala> import org.apache.spark.connect.proto
+ *   scala> :paste
+ *   // Entering paste mode (ctrl-D to finish)
+ *   val connectTestRelation =
+ *proto.Relation.newBuilder()
+ * .setRead(
+ *   proto.Read.newBuilder()
+ *       .setNamedTable(proto.Read.NamedTable.newBuilder().addParts("student").build())
+ *   .build())
+ * .build()
+ *  // Exiting paste mode, now interpreting.
+ *connectTestRelation: org.apache.spark.connect.proto.Relation =
+ * read {
+ *  named_table {
+ *parts: "student"
+ *  }
+ *}
+ *
+ *   // Now we can apply select on the proto relation above
+ *   scala> import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
+ *   scala> connectTestRelation.select(UnresolvedAttribute(Seq("id")))
+ *   res14: org.apache.spark.connect.proto.Relation =
+ *project {
+ *  input {
+ *read {
+ *  named_table {
+ *parts: "student"
+ *  }
+ *}
+ *  }
+ *  expressions {
+ *unresolved_attribute {
+ *  parts: "id"
+ *}
+ *  }
+ *}
+ *
+ * }}}
+ *
+ */
+package object connect {
+
+  object plans { // scalastyle:ignore
+    implicit class DslLogicalPlan(val logicalPlan: proto.Relation) {
+      def select(exprs: Expression*): proto.Relation = {

Review Comment:
   This shouldn't be `proto.Expression`. In the near term I don't see a need to add `def select(exprs: proto.Expression*)`.
   
   We use this to generate a proto and compare it against the normal Spark DataFrame code path, so the inputs here should be the same, i.e. normal catalyst expressions.






[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-09-27 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r981517158


##
connect/src/main/scala/org/apache/spark/sql/connect/package.scala:
##
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql

Review Comment:
   @cloud-fan would you prefer that I revert the shell change (plus remove the doc comment about how to use the shell to play with this DSL)?






[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-09-26 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r980613293


##
connect/src/main/scala/org/apache/spark/sql/connect/package.scala:
##
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql

Review Comment:
   Are you actually suggesting moving this to `org.apache.spark.sql.catalyst` in the connect module?






[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-09-26 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r980613106


##
connect/src/main/scala/org/apache/spark/sql/connect/package.scala:
##
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql

Review Comment:
   Hmm, wouldn't that be a dependency issue? connect depends on sql, but this DSL needs to depend on the connect proto.






[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-09-26 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r980480469


##
connect/src/main/scala/org/apache/spark/sql/connect/package.scala:
##
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql
+
+import org.apache.spark.connect.proto
+

Review Comment:
   Ah, I found that we need to add connect to the dependencies of the REPL module.
   
   I made the change and also updated the documentation.






[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-09-26 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r980404430


##
connect/src/main/scala/org/apache/spark/sql/connect/package.scala:
##
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql
+
+import org.apache.spark.connect.proto
+

Review Comment:
   I double-checked in case I didn't include the right DSL classes in the shell, but importing a different connect class gives the same error:
   ```
   scala> import org.apache.spark.sql.connect.service.SparkConnectService
   :25: error: object connect is not a member of package org.apache.spark.sql
          import org.apache.spark.sql.connect.service.SparkConnectService
   ```
   






[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-09-26 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r980309040


##
connect/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala:
##
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.connect.planner
+
+import org.scalatest.BeforeAndAfter
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.connect.proto
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.connect.plans.DslLogicalPlan
+import org.apache.spark.sql.test.SharedSparkSession
+
+class SparkConnectProtoSuite extends SparkFunSuite
+  with SharedSparkSession with SparkConnectPlanTest with PlanTest with BeforeAndAfter {
+
+  lazy val connectTestRelation =

Review Comment:
   I tried adding a function to initialize the test data.
   
   Basically it creates the tables and then evaluates those `lazy val`s. When the test data grows in the future, we can simply extend this util function (a sketch follows below).
   
   Do you have a better idea?
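   A minimal sketch of what such an initialization helper could look like (assuming a SharedSparkSession-style `spark` is in scope; the real columns of the `student` test table are an assumption beyond the `id` column the test selects):
   
   ```scala
   import org.apache.spark.sql.SparkSession
   
   trait TestDataSetup {
     def spark: SparkSession
   
     protected def setupTestData(): Unit = {
       // spark.range names its single column "id", which is all that the
       // "Basic select" test needs; richer test data would add more columns.
       spark.range(10).createOrReplaceTempView("student")
     }
   }
   ```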



##
connect/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala:
##
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.connect.planner
+
+import org.scalatest.BeforeAndAfter
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.connect.proto
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.connect.plans.DslLogicalPlan
+import org.apache.spark.sql.test.SharedSparkSession
+
+class SparkConnectProtoSuite extends SparkFunSuite

Review Comment:
   Done.






[GitHub] [spark] amaliujia commented on a diff in pull request #37994: [SPARK-40454][CONNECT] Initial DSL framework for protobuf testing

2022-09-26 Thread GitBox


amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r980308023


##
connect/src/main/scala/org/apache/spark/sql/connect/package.scala:
##
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql
+
+import org.apache.spark.connect.proto
+

Review Comment:
   I changed it to match what catalyst's DSL is doing. I also added a class comment.
   
   However, when I tried to document the usage guide, I found that the import that works in test code does not work in spark-shell:
   
   ```
   scala> import org.apache.spark.sql.connect.plans.DslLogicalPlan
   :23: error: object connect is not a member of package org.apache.spark.sql
          import org.apache.spark.sql.connect.plans.DslLogicalPlan
   ```
   
   I am thinking this is part of the cleanup work that we need to do. Can I create a follow-up JIRA to fix this, and come back to document the DSL usage later?
   
   


