Hi 김홍기, and welcome to S2Graph.

It seems that you already know the internal model that S2Graph uses, but let
me explain it for others who are not familiar with it, so that we can discuss
issue 3 in more detail.

Instead of converting a user-provided id into an internal unique numeric id,
S2Graph simply combines service and column metadata with the user-provided id
to guarantee a globally unique id.
Here are some important notions.

1. Service - the top level abstraction
  - A convenient logical grouping of related entities
  - Similar to the database abstraction that most relational databases
support.

2. Column - belongs to a service.
  - A set of homogeneous vertices such as users, news articles or tags.
  - Every vertex has a user-provided unique ID that allows efficient
lookup.
  - A service typically contains multiple columns.

3. Label - schema for edges
  - A set of homogeneous edges such as friendships, views, or clicks.
  - A relation between two columns, or a recursive association within
one column.
  - The two columns connected by a label need not be in the
same service, which allows us to store and query data that spans multiple
services.


From your suggestions, here is what I think.

1. Put the serviceName and columnName in the vertex properties
ex) graph.addVertex(T.id, 1, "serviceName", "gogo", "columnName", "user")

2. Replace the tp3 id with an S2Graph VertexId (serviceName, columnName, id)
ex) graph.addVertex(T.id, new S2VertexId(1, "gogo", "user"))

3. Use the vertex label
ex) graph.addVertex(T.label, "gogo::user", T.id, 1)

First of all, if we ignore Service and Column on the vertex, then there
is no way to guarantee the global uniqueness of id 1. If my service has
user id 1 and your service has user id 1, then there is no way to tell the
two apart without Service and Column, so I think ignoring them is not an
option for us.
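
To make the collision concrete, here is a minimal Java sketch. It is only
an illustration: CompositeVertexId is a hypothetical stand-in I made up for
this mail, not S2Graph's real class, and "myService" / "yourService" are
made-up names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a (service, column, user id) vertex id.
// This is NOT the real S2Graph implementation.
final class CompositeVertexId {
    final String serviceName;
    final String columnName;
    final Object userId;

    CompositeVertexId(String serviceName, String columnName, Object userId) {
        this.serviceName = Objects.requireNonNull(serviceName);
        this.columnName = Objects.requireNonNull(columnName);
        this.userId = Objects.requireNonNull(userId);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof CompositeVertexId)) return false;
        CompositeVertexId that = (CompositeVertexId) other;
        return serviceName.equals(that.serviceName)
                && columnName.equals(that.columnName)
                && userId.equals(that.userId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(serviceName, columnName, userId);
    }

    public static void main(String[] args) {
        // Keyed by the bare user id: the second put overwrites the first.
        Map<Object, String> byBareId = new HashMap<>();
        byBareId.put(1, "user 1 of my service");
        byBareId.put(1, "user 1 of your service");
        System.out.println(byBareId.size()); // prints 1

        // Keyed by (service, column, id): both vertices are kept.
        Map<CompositeVertexId, String> byCompositeId = new HashMap<>();
        byCompositeId.put(new CompositeVertexId("myService", "user", 1), "mine");
        byCompositeId.put(new CompositeVertexId("yourService", "user", 1), "yours");
        System.out.println(byCompositeId.size()); // prints 2
    }
}
```

With the bare id, the two "user 1" vertices collapse into one entry; with
the composite key they stay distinct.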

I am +1 on using the vertex label to map S2Graph's Service and Column notions
into tp3 in general, and on requiring users to provide a label on every vertex.

Concatenating serviceName and columnName with "::" does not look ideal
to me, but it should be fine, since client users will never need to
separate serviceName and columnName from the given string. Existing users of
TinkerPop should be familiar with the vertex label notation, so I think it is
the best option.
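
Client users would not need to split the label, but the library itself
would presumably have to recover the two names from it. A rough Java sketch
of that round trip follows. VertexLabels is a hypothetical helper, and the
rule that names may not contain ':' is my own assumption to keep the split
unambiguous; nothing like it has been decided.

```java
// Hypothetical helper, not part of S2Graph or TinkerPop.
final class VertexLabels {
    // Separator taken from the "gogo::user" example in this thread.
    static final String SEPARATOR = "::";

    // Assumption: names may not contain ':' so the split is unambiguous.
    private static String checkName(String name) {
        if (name == null || name.isEmpty() || name.indexOf(':') >= 0) {
            throw new IllegalArgumentException("invalid service/column name: " + name);
        }
        return name;
    }

    static String toLabel(String serviceName, String columnName) {
        return checkName(serviceName) + SEPARATOR + checkName(columnName);
    }

    // Returns a two-element array: {serviceName, columnName}.
    static String[] fromLabel(String label) {
        int at = (label == null) ? -1 : label.indexOf(SEPARATOR);
        if (at < 0) {
            throw new IllegalArgumentException("expected service::column but got: " + label);
        }
        String serviceName = checkName(label.substring(0, at));
        String columnName = checkName(label.substring(at + SEPARATOR.length()));
        return new String[] {serviceName, columnName};
    }

    public static void main(String[] args) {
        String label = toLabel("gogo", "user");
        System.out.println(label); // prints gogo::user
        String[] parts = fromLabel(label);
        System.out.println(parts[0] + " / " + parts[1]); // prints gogo / user
    }
}
```

Without some restriction on the names, a service called "a:" and a column
called "b" would produce "a:::b", which cannot be split back reliably. That
is the main thing I would want settled if we go with a plain string label.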

Being more explicit also makes sense, so option 2 seems OK to
me too. Users can be notified by a thrown exception if they do not provide
an S2VertexId for the T.id value, but this is not as familiar as the vertex label.
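
For option 2, the check could look roughly like the Java sketch below.
IdArgumentCheck and VertexIdSketch are hypothetical names of mine (the
latter stands in for the proposed S2VertexId), and the field order is an
assumption.

```java
// Hypothetical sketch of rejecting a T.id value of the wrong type.
final class IdArgumentCheck {
    // Stand-in for the proposed S2VertexId; not the real class.
    static final class VertexIdSketch {
        final String serviceName;
        final String columnName;
        final Object id;

        VertexIdSketch(String serviceName, String columnName, Object id) {
            this.serviceName = serviceName;
            this.columnName = columnName;
            this.id = id;
        }
    }

    // Returns the id unchanged if it has the expected type, otherwise throws.
    static VertexIdSketch requireVertexId(Object idValue) {
        if (idValue instanceof VertexIdSketch) {
            return (VertexIdSketch) idValue;
        }
        String actual = (idValue == null) ? "null" : idValue.getClass().getName();
        throw new IllegalArgumentException(
                "T.id must carry service, column and id, but was: " + actual);
    }

    public static void main(String[] args) {
        requireVertexId(new VertexIdSketch("gogo", "user", 1)); // accepted
        try {
            requireVertexId(1); // a bare id, as a plain TinkerPop user might pass
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The downside is visible here too: the user only learns about the
requirement at runtime, from the exception message.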

By the way, I don't quite understand the reasons you mentioned below, so
could you please elaborate on them once more?

- Using the vertex property allows individual filtering but is inconvenient
to input, and the code that forces this props is unnatural and I think it
will cost more in Serialize / Deserialize
- If you put a serviceColumn in the id, it will be cumbersome to compare
the id value itself or to perform the operation like "summary".

Thanks for your suggestions and participation in this community.

Folks, any more thoughts?

On Thu, Nov 24, 2016 at 8:05 PM 김홍기 <[email protected]> wrote:

> hi,
>
> I think about issue 3, as below.
>
> ServiceColumn does not exist in tp3 but exists only in s2graph.
> There are several ways to apply to tp3.
>
> - Put the serviceName and columnName in the vertex properties
> ex) graph.addVertex(T.id, 1, "serviceName", "gogo", "columnName", "user")
>
> - Replace the tp3 id with an S2Graph VertexId (serviceName, columnName, id)
> ex) graph.addVertex(T.id, new S2VertexId(1, "gogo", "user"))
>
> - Use the vertex label
> ex) graph.addVertex(T.label, "gogo::user", T.id, 1)
>
> - Ignore ServiceColumn on the vertex, because the edge label contains a
> ServiceColumn relationship
> ex) graph.addVertex(T.id, 1)
>
>
> I think VertexLabel is a good choice for several reasons.
>
> - Using the vertex property allows individual filtering but is inconvenient
> to input, and the code that forces this props is unnatural and I think it
> will cost more in Serialize / Deserialize
> - If you put a serviceColumn in the id, it will be cumbersome to compare
> the id value itself or to perform the operation like "summary".
> - Ignoring the service column makes it easier to access different types of
> edges, but there is a problem of data integrity
>
>
>
>
> ----------------------------------------------------------------------
>
>
>
> From: DO YUNG YOON <[email protected]>
> To: s2graph-dev <[email protected]>
> Cc:
> Date: Thu, 24 Nov 2016 03:44:36 +0000
> Subject: [DISCUSS] Support Apache TinkerPop and Gremlin
> Hi folks.
>
> After the discussion at ApacheCon BigData Europe (Seville), I was wondering if
> it is possible to change S2Graph's core library to implement tp3's interfaces
> directly rather than providing a layer atop the existing codebase.
>
> I have updated the corresponding issue
> <https://issues.apache.org/jira/browse/S2GRAPH-72> and created 2 sub-tasks (
> S2GRAPH-129 <https://issues.apache.org/jira/browse/S2GRAPH-129> ,
> S2GRAPH-130 <https://issues.apache.org/jira/browse/S2GRAPH-130> ) to try
> out this idea.
>
> @committers, please review PR99
> <https://github.com/apache/incubator-s2graph/pull/99> and PR100
> <https://github.com/apache/incubator-s2graph/pull/100> so we can proceed
> to implement all of tp3's interfaces. I intentionally left the actual
> implementation out because it may change after this discussion.
>
> Apart from that, here are a few things I want to discuss regarding support
> for Apache TinkerPop and Gremlin.
>
> 1. Data type of property values. Currently S2Graph only supports the types
> available in JSON. Is this OK? Are we going to support any other types? If
> so, what needs to be done to support other data types for property values?
>
> 2. No notion of VertexProperty. Properties are the same on Vertex and Edge in
> S2Graph, so we have to decide what our S2VertexProperty would be. Are we
> going to support this, or just say we can't provide it (at least for now)?
>
> 3. Vertex Id: S2Graph uses ServiceColumn + UserProvidedId as its internal
> vertex Id. We need to decide how we are going to map ServiceColumn onto tp3's
> Vertex. Are we going to serialize/deserialize ServiceColumn into tp3's
> Vertex label or not? Beyond ServiceColumn, I also want to discuss
> what S2Graph is going to provide through tp3's interface and
> how.
>
> Please feel free to comment not only on the above but also on anything
> regarding tp3 support in general.
>
> Thanks.
>
