I have implemented an initial draft based on the suggestions. Jira issue: S2GRAPH-131 <https://issues.apache.org/jira/browse/S2GRAPH-131>, PR: pull 101 <https://github.com/apache/incubator-s2graph/pull/101>. Please review it and let me know whether it matches what you suggested. I think I understood the VertexLabel part, but I am not sure about the other suggestions. Comments and PRs are more than welcome.
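To make the VertexLabel part concrete, here is a minimal sketch of how I read it. This is not the code in the PR: `VertexLabelSketch`, `ServiceColumn`, `parse`, and `globalId` are hypothetical names used only for illustration, and the sketch assumes nothing beyond a label of the form `serviceName::columnName`, as proposed in the thread. It deliberately uses no TinkerPop or S2Graph classes.

```java
// Hypothetical sketch: none of these names come from S2Graph or TinkerPop.
final class VertexLabelSketch {
    // Delimiter proposed in the thread for "serviceName::columnName".
    static final String DELIMITER = "::";

    /** Service and column recovered from a vertex label. */
    static final class ServiceColumn {
        final String serviceName;
        final String columnName;

        ServiceColumn(String serviceName, String columnName) {
            this.serviceName = serviceName;
            this.columnName = columnName;
        }
    }

    /** Splits a label at the first "::" and rejects labels missing either part. */
    static ServiceColumn parse(String label) {
        int idx = label.indexOf(DELIMITER);
        if (idx <= 0 || idx + DELIMITER.length() >= label.length()) {
            throw new IllegalArgumentException(
                "vertex label must look like serviceName::columnName, got: " + label);
        }
        return new ServiceColumn(
            label.substring(0, idx),
            label.substring(idx + DELIMITER.length()));
    }

    /** Combines service, column, and the user-provided id into one string key. */
    static String globalId(String label, Object userProvidedId) {
        ServiceColumn sc = parse(label);
        return sc.serviceName + DELIMITER + sc.columnName + DELIMITER + userProvidedId;
    }

    public static void main(String[] args) {
        // The same user-provided id 1 under two services yields two distinct keys.
        System.out.println(globalId("gogo::user", 1));
        System.out.println(globalId("other::user", 1));
    }
}
```

With this reading, a user-provided id of 1 stays distinguishable across services as long as the label is mandatory, which is why a missing or malformed label is rejected rather than defaulted.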
Apart from the issue above, I want to start a discussion on the Features that S2Graph can provide. As far as I understand, this is done by the provider overriding interfaces such as ElementFeatures, so we need to discuss which features are available in S2Graph:

- DataTypeFeatures
- EdgeFeatures
- EdgePropertyFeatures
- ElementFeatures
- GraphFeatures
- PropertyFeatures
- VariableFeatures
- VertexFeatures
- VertexPropertyFeatures

I am not an expert on TinkerPop3, so I need more time to digest what these features are. I would appreciate it if anyone who knows what they are could share which features S2Graph can provide.

Thanks.

On Mon, Nov 28, 2016 at 11:20 AM daewon <[email protected]> wrote:

> Hongki's suggestion is the most intuitive.
> +1
>
> On Mon, Nov 28, 2016 at 11:08 AM Hyunsung Jo <[email protected]> wrote:
>
> Hi All,
> From the three options that Hongki suggested, using S2Graph serviceName and columnName as TinkerPop3 Vertex Label seems like the most intuitive. So +1 on that.
>
> Regards,
> Jo
>
> On Fri, Nov 25, 2016 at 12:50 PM hongki kim <[email protected]> wrote:
>
> > I concluded my thoughts briefly, and I agree that the details are lacking.
> > Let me write a little more detail.
> >
> > - The vertex property allows individual filtering but is inconvenient to input. The code that forces these props is not natural, I think, and Serialize / Deserialize will cost more.
> > The following example shows individual filtering on ServiceName and ColumnName, I think.
> > Ex) g.V().has("serviceName", "gogo")
> >
> > => I think that Vertex Property uses java.util.Map, so the strings "serviceName" and "columnName" would always have to be serialized together. Of course, you could also use "_s" or "_c". I also worry that they would become reserved Property Keys only in s2graph.
> >
> > - If you put a serviceColumn in the id, it will be cumbersome to compare the id value itself or to perform an operation like "summary".
> >
> > => My English was not very good, so I guess my explanation was not accurate. My concern was that, when specifying several vertex ids, gremlin would have to list them with an expression like the following:
> > Ex) g.V(new S2VertexId(1, "gogo", "userid"), new S2VertexId(2, "gogo", "userid"))
> >
> > Also, regarding "summary", I was wondering whether the following operation would work. The point is that the S2VertexId object would need to return the actual id.
> > Ex) g.V().id().sum() # tp3
> >     g.V().id().id().sum() # s2graph (with S2VertexId)
> >
> > Thanks for reading
> >
> > 2016-11-24 22:57 GMT+09:00 DO YUNG YOON <[email protected]>:
> >
> > > Hi 김홍기, and welcome to S2Graph.
> > >
> > > It seems that you already know the internal model that S2Graph uses, but let me explain it for others who are not familiar with it, so we can discuss issue 3 in more detail.
> > >
> > > Instead of converting a user-provided Id into an internal unique numeric Id, S2Graph simply combines service and column metadata with the user-provided Id to guarantee a globally unique Id.
> > > Here are some important notions.
> > >
> > > 1. Service - the top level abstraction
> > >   - A convenient logical grouping of related entities
> > >   - Similar to the database abstraction that most relational databases support.
> > >
> > > 2. Column - belongs to a service.
> > >   - A set of homogeneous vertices such as users, news articles or tags.
> > >   - Every vertex has a user-provided unique ID that allows efficient lookup.
> > >   - A service typically contains multiple columns.
> > >
> > > 3. Label - schema for edge
> > >   - A set of homogeneous edges such as friendships, views, or clicks.
> > >   - Relation between two columns as well as a recursive association within one column.
> > >   - The two columns connected with a label may not necessarily be in the same service, allowing us to store and query data that spans multiple services.
> > >
> > > From your suggestion, here is what I thought.
> > >
> > > 1. Put the ServiceName and columnName in the vertex property
> > > ex) graph.addVertex(T.id, 1, "serviceName", "gogo", "columnName", "user")
> > >
> > > 2. Replace tp3 id with S2graph VertexId (ServiceName, columnName, id)
> > > ex) graph.addVertex(T.id, new S2VertexId(1, "gogo", "user"))
> > >
> > > 3. Use Vertex Label
> > > ex) graph.addVertex(T.label, "gogo::user", T.id, 1)
> > >
> > > First of all, I think if we ignore Service and Column on the vertex, then there is no way to guarantee the global uniqueness of id 1. If my service has user id 1 and your service has user id 1, there is no way to distinguish them without Service and Column, so I think ignoring them is not an option for us.
> > >
> > > I am +1 on using Vertex Label to map S2Graph's Service and Column notions into tp3 in general, and on forcing users to provide a vertex label on each vertex.
> > >
> > > Concatenating serviceName and columnName with "::" does not look like the best option to me, but it should be fine, since client users will never need to separate serviceName and columnName from the given string. Existing users of TinkerPop should be familiar with the vertex label notation, so I think it is the best option.
> > >
> > > I also think being more explicit makes sense, so option 2 seems ok to me too. Users can be notified by a thrown exception if they do not provide an S2VertexId for the T.id value, but this is not as familiar as a vertex label.
> > >
> > > By the way, I don't quite understand the reasons you mentioned below, so can you please elaborate on them one more time?
> > >
> > > - Using the vertex property allows individual filtering but is inconvenient to input, and the code that forces these props is unnatural and I think it will cost more in Serialize / Deserialize
> > > - If you put a serviceColumn in the id, it will be cumbersome to compare the id value itself or to perform an operation like "summary".
> > >
> > > Thanks for your suggestion and participation in this community.
> > >
> > > Folks, any more thoughts?
> > >
> > > On Thu, Nov 24, 2016 at 8:05 PM 김홍기 <[email protected]> wrote:
> > >
> > > > Hi,
> > > >
> > > > My thoughts on issue 3 are below.
> > > >
> > > > ServiceColumn does not exist in tp3; it exists only in s2graph.
> > > > There are several ways to map it to tp3.
> > > >
> > > > - Put the ServiceName and columnName in the vertex property
> > > > ex) graph.addVertex(T.id, 1, "serviceName", "gogo", "columnName", "user")
> > > >
> > > > - Replace tp3 id with S2graph VertexId (ServiceName, columnName, id)
> > > > ex) graph.addVertex(T.id, new S2VertexId(1, "gogo", "user"))
> > > >
> > > > - Use Vertex Label
> > > > ex) graph.addVertex(T.label, "gogo::user", T.id, 1)
> > > >
> > > > - Ignore ServiceColumn on Vertex because edge label contains a ServiceColumn relationship
> > > > ex) graph.addVertex(T.id, 1)
> > > >
> > > > I think VertexLabel is a good choice, for the following reasons.
> > > >
> > > > - Using the vertex property allows individual filtering but is inconvenient to input, and the code that forces these props is unnatural and I think it will cost more in Serialize / Deserialize
> > > > - If you put a serviceColumn in the id, it will be cumbersome to compare the id value itself or to perform an operation like "summary".
> > > > - Ignoring the service column makes it easier to access different types of edges, but there is a problem of data integrity.
> > > >
> > > > ----------------------------------------------------------------------
> > > >
> > > > From: DO YUNG YOON <[email protected]>
> > > > To: s2graph-dev <[email protected]>
> > > > Cc:
> > > > Date: Thu, 24 Nov 2016 03:44:36 +0000
> > > > Subject: [DISCUSS] Support Apache TinkerPop and Gremlin
> > > >
> > > > Hi folks.
> > > >
> > > > After the discussion at ApacheCon Big Data Europe (Seville), I was wondering if it is possible to change S2Graph's core library to implement tp3's interfaces directly rather than providing a layer atop the existing codebase.
> > > >
> > > > I have updated the corresponding issue <https://issues.apache.org/jira/browse/S2GRAPH-72> and created 2 sub tasks (S2GRAPH-129 <https://issues.apache.org/jira/browse/S2GRAPH-129>, S2GRAPH-130 <https://issues.apache.org/jira/browse/S2GRAPH-130>) to try out this idea.
> > > >
> > > > @committers, please review PR99 <https://github.com/apache/incubator-s2graph/pull/99> and PR100 <https://github.com/apache/incubator-s2graph/pull/100> so we can proceed to actually implement all the tp3 interfaces. I intentionally left the actual implementation out because it may change after this discussion.
> > > >
> > > > Apart from that, here are a few things I want to discuss regarding support for Apache TinkerPop and Gremlin.
> > > >
> > > > 1. Data type of property value. Currently S2Graph only supports types available in JSON. Is this ok? Are we going to support any other types? If so, what needs to be done to support other data types for property values?
> > > >
> > > > 2. No notion of VertexProperty. Property is the same on Vertex and Edge in S2Graph, so we have to decide what our S2VertexProperty would be. Are we going to support this, or just say we can't provide it (for now or otherwise)?
> > > >
> > > > 3. Vertex Id: S2Graph uses ServiceColumn + UserProvidedId as the internal vertex Id. We need to decide how we are going to map ServiceColumn into tp3's Vertex. Are we going to serialize/deserialize ServiceColumn into tp3's Vertex label or not? This is not just about ServiceColumn; I want to discuss further what S2Graph is going to provide through tp3's interface and how.
> > > >
> > > > Please feel free to comment not only on the above but also on anything regarding tp3 support in general.
> > > >
> > > > Thanks.
