tengqm commented on code in PR #6977:
URL: https://github.com/apache/gravitino/pull/6977#discussion_r2052115593


##########
docs/open-api/lineage.yaml:
##########
@@ -0,0 +1,281 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+---
+
+openapi: 3.0.2
+info:
+  title: OpenLineage
+  version: 2-0-2
+  description: OpenLineage is an open source **lineage and metadata collection 
API** for the data ecosystem.
+  license:
+    name: Apache 2.0
+    url: http://www.apache.org/licenses/LICENSE-2.0.html
+paths:
+  /api/lineage:
+    post:
+      summary: Send an event related to the state of a run
+      description: Updates a run state for a job.
+      operationId: postRunEvent
+      tags:
+        - OpenLineage
+      requestBody:
+        content:
+          application/json:
+            schema:
+              oneOf:
+                - $ref: '#/components/schemas/RunEvent'
+                - $ref: '#/components/schemas/DatasetEvent'
+                - $ref: '#/components/schemas/JobEvent'
+      responses:
+        200:
+          description: OK
+components:
+  schemas:
+    BaseEvent:
+      type: object
+      properties:
+        eventTime:
+          description: the time the event occurred at
+          type: string
+          format: date-time
+        producer:
+          description: URI identifying the producer of this metadata. For 
example this could be a git url with a given tag or sha
+          type: string
+          format: uri
+          example: 
https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client
+        schemaURL:
+          description: The JSON Pointer (https://tools.ietf.org/html/rfc6901) 
URL to the corresponding version of the schema definition for this RunEvent
+          type: string
+          format: uri
+          example: https://openlineage.io/spec/0-0-1/OpenLineage.json
+      required:
+        - eventTime
+        - producer
+        - schemaURL
+    BaseFacet:
+      description: all fields of the base facet are prefixed with _ to avoid 
name conflicts in facets
+      type: object
+      properties:
+        _producer:
+          description: URI identifying the producer of this metadata. For 
example this could be a git url with a given tag or sha
+          type: string
+          format: uri
+          example: 
https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client
+        _schemaURL:
+          description: The JSON Pointer (https://tools.ietf.org/html/rfc6901) 
URL to the corresponding version of the schema definition for this facet
+          type: string
+          format: uri
+          example: 
https://openlineage.io/spec/1-0-2/OpenLineage.json#/$defs/BaseFacet
+      additionalProperties: true

Review Comment:
   This ... is a little bit dangerous ...
   We don't want to validate this object anyway, right?
   



##########
docs/open-api/lineage.yaml:
##########
@@ -0,0 +1,281 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+---
+
+openapi: 3.0.2
+info:
+  title: OpenLineage
+  version: 2-0-2
+  description: OpenLineage is an open source **lineage and metadata collection 
API** for the data ecosystem.
+  license:
+    name: Apache 2.0
+    url: http://www.apache.org/licenses/LICENSE-2.0.html
+paths:
+  /api/lineage:
+    post:
+      summary: Send an event related to the state of a run
+      description: Updates a run state for a job.
+      operationId: postRunEvent
+      tags:
+        - OpenLineage
+      requestBody:
+        content:
+          application/json:
+            schema:
+              oneOf:
+                - $ref: '#/components/schemas/RunEvent'
+                - $ref: '#/components/schemas/DatasetEvent'
+                - $ref: '#/components/schemas/JobEvent'
+      responses:
+        200:
+          description: OK
+components:
+  schemas:
+    BaseEvent:
+      type: object
+      properties:
+        eventTime:
+          description: the time the event occurred at
+          type: string
+          format: date-time
+        producer:
+          description: URI identifying the producer of this metadata. For 
example this could be a git url with a given tag or sha
+          type: string
+          format: uri
+          example: 
https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client
+        schemaURL:
+          description: The JSON Pointer (https://tools.ietf.org/html/rfc6901) 
URL to the corresponding version of the schema definition for this RunEvent
+          type: string
+          format: uri
+          example: https://openlineage.io/spec/0-0-1/OpenLineage.json
+      required:
+        - eventTime
+        - producer
+        - schemaURL
+    BaseFacet:
+      description: all fields of the base facet are prefixed with _ to avoid 
name conflicts in facets
+      type: object
+      properties:
+        _producer:
+          description: URI identifying the producer of this metadata. For 
example this could be a git url with a given tag or sha
+          type: string
+          format: uri
+          example: 
https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client
+        _schemaURL:
+          description: The JSON Pointer (https://tools.ietf.org/html/rfc6901) 
URL to the corresponding version of the schema definition for this facet
+          type: string
+          format: uri
+          example: 
https://openlineage.io/spec/1-0-2/OpenLineage.json#/$defs/BaseFacet
+      additionalProperties: true
+      required:
+        - _producer
+        - _schemaURL
+    RunFacet:
+      description: A Run Facet
+      type: object
+      allOf:
+        - $ref: '#/components/schemas/BaseFacet'
+    Run:
+      type: object
+      properties:
+        runId:
+          description: The globally unique ID of the run associated with the 
job.
+          type: string
+          format: uuid
+        facets:
+          description: The run facets.
+          type: object
+          anyOf:

Review Comment:
   Why this `anyOf` ?
   



##########
docs/open-api/lineage.yaml:
##########
@@ -0,0 +1,281 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+---
+
+openapi: 3.0.2
+info:
+  title: OpenLineage
+  version: 2-0-2
+  description: OpenLineage is an open source **lineage and metadata collection 
API** for the data ecosystem.
+  license:
+    name: Apache 2.0
+    url: http://www.apache.org/licenses/LICENSE-2.0.html
+paths:
+  /api/lineage:
+    post:
+      summary: Send an event related to the state of a run
+      description: Updates a run state for a job.
+      operationId: postRunEvent
+      tags:
+        - OpenLineage
+      requestBody:
+        content:
+          application/json:
+            schema:
+              oneOf:
+                - $ref: '#/components/schemas/RunEvent'
+                - $ref: '#/components/schemas/DatasetEvent'
+                - $ref: '#/components/schemas/JobEvent'
+      responses:
+        200:
+          description: OK
+components:
+  schemas:
+    BaseEvent:
+      type: object
+      properties:
+        eventTime:
+          description: the time the event occurred at
+          type: string
+          format: date-time
+        producer:
+          description: URI identifying the producer of this metadata. For 
example this could be a git url with a given tag or sha
+          type: string
+          format: uri
+          example: 
https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client
+        schemaURL:
+          description: The JSON Pointer (https://tools.ietf.org/html/rfc6901) 
URL to the corresponding version of the schema definition for this RunEvent
+          type: string
+          format: uri
+          example: https://openlineage.io/spec/0-0-1/OpenLineage.json
+      required:
+        - eventTime
+        - producer
+        - schemaURL
+    BaseFacet:
+      description: all fields of the base facet are prefixed with _ to avoid 
name conflicts in facets
+      type: object
+      properties:
+        _producer:
+          description: URI identifying the producer of this metadata. For 
example this could be a git url with a given tag or sha
+          type: string
+          format: uri
+          example: 
https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client
+        _schemaURL:
+          description: The JSON Pointer (https://tools.ietf.org/html/rfc6901) 
URL to the corresponding version of the schema definition for this facet
+          type: string
+          format: uri
+          example: 
https://openlineage.io/spec/1-0-2/OpenLineage.json#/$defs/BaseFacet
+      additionalProperties: true
+      required:
+        - _producer
+        - _schemaURL
+    RunFacet:
+      description: A Run Facet
+      type: object
+      allOf:
+        - $ref: '#/components/schemas/BaseFacet'
+    Run:
+      type: object
+      properties:
+        runId:

Review Comment:
   You don't have to explicitly name the ID field of an object this way.
   Imagine this in your code,
   
   ```
   var run: Run;
   var thisId = run.runId;
   ```
   
   The ID has a crystal clear scope, which is the `Run`.
   Simply call it `id` is sufficient.
   
   ```
   var run: Run
   var thisID = run.id
   ```
   



##########
docs/open-api/lineage.yaml:
##########
@@ -0,0 +1,281 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+---
+
+openapi: 3.0.2
+info:
+  title: OpenLineage
+  version: 2-0-2
+  description: OpenLineage is an open source **lineage and metadata collection 
API** for the data ecosystem.
+  license:
+    name: Apache 2.0
+    url: http://www.apache.org/licenses/LICENSE-2.0.html
+paths:
+  /api/lineage:
+    post:
+      summary: Send an event related to the state of a run
+      description: Updates a run state for a job.
+      operationId: postRunEvent
+      tags:
+        - OpenLineage
+      requestBody:
+        content:
+          application/json:
+            schema:
+              oneOf:
+                - $ref: '#/components/schemas/RunEvent'
+                - $ref: '#/components/schemas/DatasetEvent'
+                - $ref: '#/components/schemas/JobEvent'
+      responses:
+        200:
+          description: OK
+components:
+  schemas:
+    BaseEvent:
+      type: object
+      properties:
+        eventTime:
+          description: the time the event occurred at
+          type: string
+          format: date-time
+        producer:
+          description: URI identifying the producer of this metadata. For 
example this could be a git url with a given tag or sha
+          type: string
+          format: uri
+          example: 
https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client
+        schemaURL:
+          description: The JSON Pointer (https://tools.ietf.org/html/rfc6901) 
URL to the corresponding version of the schema definition for this RunEvent
+          type: string
+          format: uri
+          example: https://openlineage.io/spec/0-0-1/OpenLineage.json
+      required:
+        - eventTime
+        - producer
+        - schemaURL
+    BaseFacet:
+      description: all fields of the base facet are prefixed with _ to avoid 
name conflicts in facets
+      type: object
+      properties:
+        _producer:
+          description: URI identifying the producer of this metadata. For 
example this could be a git url with a given tag or sha
+          type: string
+          format: uri
+          example: 
https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client
+        _schemaURL:
+          description: The JSON Pointer (https://tools.ietf.org/html/rfc6901) 
URL to the corresponding version of the schema definition for this facet
+          type: string
+          format: uri
+          example: 
https://openlineage.io/spec/1-0-2/OpenLineage.json#/$defs/BaseFacet
+      additionalProperties: true
+      required:
+        - _producer
+        - _schemaURL
+    RunFacet:
+      description: A Run Facet
+      type: object
+      allOf:
+        - $ref: '#/components/schemas/BaseFacet'

Review Comment:
   a `RunFacet` is identical to a `BaseFacet`, why do we model the same API 
resource again?



##########
docs/open-api/lineage.yaml:
##########
@@ -0,0 +1,281 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+---
+
+openapi: 3.0.2
+info:
+  title: OpenLineage
+  version: 2-0-2
+  description: OpenLineage is an open source **lineage and metadata collection 
API** for the data ecosystem.
+  license:
+    name: Apache 2.0
+    url: http://www.apache.org/licenses/LICENSE-2.0.html
+paths:
+  /api/lineage:
+    post:
+      summary: Send an event related to the state of a run
+      description: Updates a run state for a job.
+      operationId: postRunEvent
+      tags:
+        - OpenLineage
+      requestBody:
+        content:
+          application/json:
+            schema:
+              oneOf:
+                - $ref: '#/components/schemas/RunEvent'
+                - $ref: '#/components/schemas/DatasetEvent'
+                - $ref: '#/components/schemas/JobEvent'

Review Comment:
   
https://swagger.io/docs/specification/v3_0/data-models/inheritance-and-polymorphism/
   
   Polymorphism might be good for implementation code, but not for API design. 
It makes things very difficult to process. The *origin of the sin* lies in the 
`oneOf` stanza.
   I'd suggest we avoid doing this in the API design.
   
   If we have to do this (polymorphism), please consider add a 'discriminator'. 
Consider this, if you are implementing the server side handler for this API, 
how could you easily distinguish one event type from another?



##########
docs/open-api/lineage.yaml:
##########
@@ -0,0 +1,281 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+---
+
+openapi: 3.0.2
+info:
+  title: OpenLineage
+  version: 2-0-2
+  description: OpenLineage is an open source **lineage and metadata collection 
API** for the data ecosystem.
+  license:
+    name: Apache 2.0
+    url: http://www.apache.org/licenses/LICENSE-2.0.html
+paths:
+  /api/lineage:
+    post:
+      summary: Send an event related to the state of a run
+      description: Updates a run state for a job.
+      operationId: postRunEvent
+      tags:
+        - OpenLineage
+      requestBody:
+        content:
+          application/json:
+            schema:
+              oneOf:
+                - $ref: '#/components/schemas/RunEvent'
+                - $ref: '#/components/schemas/DatasetEvent'
+                - $ref: '#/components/schemas/JobEvent'
+      responses:
+        200:
+          description: OK
+components:
+  schemas:
+    BaseEvent:
+      type: object
+      properties:
+        eventTime:
+          description: the time the event occurred at
+          type: string
+          format: date-time
+        producer:
+          description: URI identifying the producer of this metadata. For 
example this could be a git url with a given tag or sha
+          type: string
+          format: uri
+          example: 
https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client
+        schemaURL:
+          description: The JSON Pointer (https://tools.ietf.org/html/rfc6901) 
URL to the corresponding version of the schema definition for this RunEvent
+          type: string
+          format: uri
+          example: https://openlineage.io/spec/0-0-1/OpenLineage.json
+      required:
+        - eventTime
+        - producer
+        - schemaURL
+    BaseFacet:
+      description: all fields of the base facet are prefixed with _ to avoid 
name conflicts in facets

Review Comment:
   ```suggestion
         description: all fields of the base facet are prefixed with `_` to 
avoid name conflicts in facets.
   ```



##########
docs/open-api/lineage.yaml:
##########
@@ -0,0 +1,281 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+---
+
+openapi: 3.0.2
+info:
+  title: OpenLineage
+  version: 2-0-2
+  description: OpenLineage is an open source **lineage and metadata collection 
API** for the data ecosystem.
+  license:
+    name: Apache 2.0
+    url: http://www.apache.org/licenses/LICENSE-2.0.html
+paths:
+  /api/lineage:

Review Comment:
   Another thought on this ...
   The RESTful API design convention is to have plurals of resources in the 
path.
   For the `POST` verb, most of the time it is about creating something.
   So ....
   
   1. I'm not sure if `lineage` is a suitable noun here for the resource 
collection, my guess is not. `lineage` can be an API prefix for differentiating 
this group of APIs from other APIs though.
   2. By reading the JSON body payload, it smells to me that we are creating 
"events". However, the description of the method says that it is updating the 
run state of a job. This is confusing.
   3. We have only one API for generating the "events", and we have no idea 
where those events are stored or processed. This API looks weird to me.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to