Hi Lixue, thanks for the reply.
Regarding:
> 1. YAML's format is more human-readable and easier to edit, which is a 
> significant advantage in scenarios where we frequently need to view or modify 
> configuration files. For example, to define a subgraph from an existing graph.
I do not agree that we should let users edit the YAML/JSON files directly. 
Manual modification of schema files is unreliable and unpredictable, and would 
probably introduce errors whose cause users cannot even identify. That is why we 
are going to provide a CLI that restricts the operations on graph data, including 
projecting a subgraph.
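
As a rough sketch of what such a CLI operation might do internally (the 
function name `project_subgraph` and its arguments are hypothetical, not an 
existing GraphAr API), a projection could be derived from the parent graph info 
and rejected when it names files the parent does not contain:

```python
import json


def project_subgraph(graph_info, name, vertices, edges):
    """Return a new graph-info dict restricted to the given vertex/edge files.

    Hypothetical helper: refuses entries that are not in the parent graph,
    which is the kind of mistake a manual edit would silently allow.
    """
    unknown_v = sorted(set(vertices) - set(graph_info["vertices"]))
    unknown_e = sorted(set(edges) - set(graph_info["edges"]))
    if unknown_v or unknown_e:
        raise ValueError(
            f"not part of {graph_info['name']}: {unknown_v + unknown_e}"
        )
    return {
        "name": name,
        "vertices": list(vertices),
        "edges": list(edges),
        "version": graph_info["version"],
        "extra_metadata": dict(graph_info.get("extra_metadata", {})),
    }


# Parent graph info, written out as a plain dict for illustration.
ldbc_sample = {
    "name": "ldbc_sample",
    "vertices": ["person.vertex.yml"],
    "edges": ["person_knows_person.edge.yml"],
    "version": "gar/v1",
    "extra_metadata": {},
}

# Project a subgraph that keeps the person vertices and drops all edges.
persons_only = project_subgraph(
    ldbc_sample, "ldbc_persons", ["person.vertex.yml"], []
)
print(json.dumps(persons_only, indent=2))
```

The point is only that the tool, not the user, produces the new file, so the 
result always stays consistent with the parent graph.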

As for human readability, here is ldbc-sample.graph.yml in YAML and in JSON:
```
name: ldbc_sample
vertices:
  - person.vertex.yml
edges:
  - person_knows_person.edge.yml
version: gar/v1
extra_metadata: {}
```
```
{
  "name": "ldbc_sample",
  "vertices": [
    "person.vertex.yml"
  ],
  "edges": [
    "person_knows_person.edge.yml"
  ],
  "version": "gar/v1",
  "extra_metadata": {}
}
```
JSON is readable enough, I think, though not as configurable as YAML. But since 
the files are not allowed to be modified directly, I think JSON is OK.

> 2. YAML often provides a more concise representation of the same data.

Can you give an example showing why YAML provides a more concise 
representation of the data?

> 3. YAML natively supports comments and extensions, making it more flexible.

I agree that YAML supports more features and is more flexible. But it is so 
flexible that it cannot provide much template validation support. For the GraphAr 
format, we should consider whether the format is sufficient to express the schema 
and configuration of GraphAr. On this point, JSON is good enough for me.
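
To illustrate the kind of strict check I have in mind, here is a minimal sketch 
using only the Python standard library. The field table is an assumption taken 
from the ldbc_sample graph file above, not the official GraphAr schema, and 
`validate_graph_info` is a hypothetical name:

```python
import json

# Assumed field spec (key -> expected Python type), derived from the sample
# graph file in this mail; not the authoritative GraphAr definition.
GRAPH_INFO_FIELDS = {
    "name": str,
    "vertices": list,
    "edges": list,
    "version": str,
    "extra_metadata": dict,
}


def validate_graph_info(text):
    """Parse JSON text and return a list of problems (empty list if valid)."""
    try:
        info = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(info, dict):
        return ["top level must be an object"]
    problems = []
    for key, expected in GRAPH_INFO_FIELDS.items():
        if key not in info:
            problems.append(f"missing field: {key}")
        elif not isinstance(info[key], expected):
            problems.append(f"{key}: expected {expected.__name__}")
    # Reject anything outside the known field set.
    for key in sorted(info.keys() - GRAPH_INFO_FIELDS.keys()):
        problems.append(f"unknown field: {key}")
    return problems


sample = json.dumps({
    "name": "ldbc_sample",
    "vertices": ["person.vertex.yml"],
    "edges": ["person_knows_person.edge.yml"],
    "version": "gar/v1",
    "extra_metadata": {},
})
print(validate_graph_info(sample))         # no problems expected
print(validate_graph_info('{"name": 1}'))  # wrong type and missing fields
```

With a fixed, closed set of fields like this, every library can apply the same 
check, whether it is hand-written as here or generated from an IDL.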


On 2024/05/10 01:47:47 "李雪(有理)" wrote:
> Thank you for the information and links provided. While I understand the 
> application of JSON in GraphScope Flex and its advantages when integrated 
> with GraphAr, considering our specific use case, I still think that YAML 
> might be a more suitable choice for us. Here are the primary reasons:
> 1. YAML's format is more human-readable and easier to edit, which is a 
> significant advantage in scenarios where we frequently need to view or modify 
> configuration files. For example, to define a subgraph from an existing graph.
> 2. YAML often provides a more concise representation of the same data.
> 3. YAML natively supports comments and extensions, making it more flexible.
> Therefore, we initially favored YAML over JSON. I hope we can further discuss 
> this topic to find the solution that best fits our project requirements.
> ------------------------------------------------------------------
> From: Weibin Zeng <wei...@apache.org>
> Sent: Thursday, May 9, 2024, 18:52
> To: dev <dev@graphar.apache.org>
> Subject: Re: [DISCUSS][format] Using an Interface Definition Language to define 
> GraphAr format
> Sorry, I missed the link.
> > GraphScope Flex now uses JSON as the communication format for graph schema and 
> > checks it with the REST API[1]
> [1] 
> https://github.com/alibaba/GraphScope/tree/main/python/graphscope/flex/rest/models
> On 2024/05/09 10:49:24 Weibin Zeng wrote:
> > JSON is OK for me. GraphScope Flex now uses JSON as the communication 
> > format for graph schema and checks it with the REST API[1], so I think 
> > switching to JSON is good for GraphAr, since GraphAr has been integrated 
> > into GraphScope.
> > 
> > On 2024/05/09 08:50:45 Sem wrote:
> > > I did some research on this, and it seems to me that classes
> > > generated from protobuf are not serializable into other formats like
> > > YAML/JSON.
> > > 
> > > There is a third-party project, https://github.com/krzko/proto2yaml,
> > > that provides such a utility, but it does not look well maintained.
> > > 
> > > I see that there is a utility provided by Google. It allows
> > > conversion to and from JSON in Java/Python (most probably C++
> > > too):
> > > 1.
> > > https://cloud.google.com/java/docs/reference/protobuf/latest/com.google.protobuf.util.JsonFormat
> > > 2.
> > > https://googleapis.dev/python/protobuf/latest/google/protobuf/json_format.html
> > > 
> > > But not to/from YAML.
> > > 
> > > For me this question is important, because we need not only to generate
> > > the code but also to resolve the question of serialization/deserialization.
> > > 
> > > 
> > > What do you think about using proto (binary messages) as the underlying
> > > communication format in the code and JSON as the human-readable
> > > representation on disk? It looks like only by switching from
> > > YAML to JSON can we achieve all the benefits of using protobuf.
> > > 
> > > With JSON, I see it like this: we call `fromJson` once (via the Google
> > > protobuf util) to read the data and create proto classes from the JSON info
> > > files, and then work with them. At the end we call `toJson` to
> > > serialize the messages back.
> > > 
> > > To achieve the same with YAML, we would need to use a third-party and not
> > > well maintained library, or support our own serialization/deserialization of
> > > proto messages (classes) to/from YAML for three languages (Python,
> > > Java, C++).
> > > 
> > > On Thu, 2024-05-09 at 15:27 +0800, weibin.zen wrote:
> > > > Hi, everyone,
> > > > 
> > > > I would like to propose that we consider using an Interface
> > > > Definition Language (IDL) like Protobuf[1] for the GraphAr format
> > > > definition.
> > > > Currently we use YAML to describe the schema and metadata of a graph, and
> > > > store the data in common formats like CSV/Parquet. YAML
> > > > is human-readable, but it cannot provide much
> > > > validation or version control. And the libraries in various programming
> > > > languages need
> > > > to parse the files and validate them by themselves.
> > > > 
> > > > Using an IDL to describe the format would bring benefits like:
> > > > 
> > > > • It provides a clear, standardized, language-agnostic format definition
> > > > that can be version-controlled and shared by libraries, and makes the
> > > > format consistent between implementations.
> > > > • The validation done by protobuf can be used directly to validate
> > > > the schema, so the libraries need not implement the validation themselves.
> > > > • Cross-language support: libraries can use the generated structures
> > > > as graph info directly.
> > > > 
> > > > 
> > > > This proposal is not to replace YAML with Protobuf. We would still use
> > > > YAML as the final schema & metadata file for user readability, but with
> > > > an IDL to maintain a
> > > > robust and precise schema definition. It is a kind of hybrid strategy that
> > > > accommodates both human and machine needs.
> > > > 
> > > > But using an IDL does bring some disadvantages. Sem has listed some in the
> > > > comments of PR[2]:
> > > > 
> > > > • the generated code is huge and unreadable.
> > > > • the generated code may need to be stored in git.
> > > > • debugging is very hard.
> > > > 
> > > > 
> > > > Since this would be a huge change, I want to hear your thoughts
> > > > about the proposal.
> > > > 
> > > > 
> > > > [1] https://protobuf.dev/
> > > > [2] https://github.com/apache/incubator-graphar/pull/475
> > > > 
> > > > Best
> > > > weibin.zen
> > > 
> > > 
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscr...@graphar.apache.org
> > > For additional commands, e-mail: dev-h...@graphar.apache.org
> > > 
> > > 
