Elaborating on Ido's point about 'what you want to do with it', I like to
think about the structure in terms of the queries you plan to run against
it. If you are not going to query on those properties, go for the simplest
structure: plain node properties, or do not store the information at all.
If you do plan to query on them, the diversity of the values is a good hint
as to how to structure the data. If the values are very diverse, or there
are many possible values, use a general-purpose index such as the built-in
Lucene index, or build your own tree-graph index. If the value set is more
discrete, the category index described by Anders and Ido is a good
solution. It will perform faster than a complex generic index (like Lucene)
and will also give you a graph structure that looks (IMHO) more like your
mental data model. That is, I assume that if you have a limited set of
categories and are likely to query by them, then that is the natural
structure of the data model :-)
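
To make the trade-off concrete, here is a minimal sketch in plain Python.
It does not use any Neo4j API: the dicts stand in for nodes, the index
entries stand in for category nodes with relationships to their products,
and all names and values (products, find_by_scan, build_category_index)
are hypothetical, made up for illustration only.

```python
from collections import defaultdict

# Hypothetical product "nodes"; the data is invented for this sketch.
products = [
    {"name": "TV A", "category": "tv", "screen_size": 42},
    {"name": "TV B", "category": "tv", "screen_size": 50},
    {"name": "Radio C", "category": "radio", "screen_size": None},
]


def find_by_scan(items, key, value):
    """No index: inspect the property on every item for each query."""
    return [p["name"] for p in items if p.get(key) == value]


def build_category_index(items, key):
    """Category index: one entry per discrete value, listing its members.

    Each entry plays the role of a category node; the list plays the role
    of the relationships from that node to its products.
    """
    index = defaultdict(list)
    for p in items:
        index[p[key]].append(p["name"])
    return index


category_index = build_category_index(products, "category")

tvs_by_scan = find_by_scan(products, "category", "tv")
tvs_by_index = category_index["tv"]  # direct lookup, no scan over products
```

Both lookups return the same products. The difference is that the scan
touches every product on each query, while the category entry is reached
directly, which is why a small, discrete value set suits this structure.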

On Sat, Dec 25, 2010 at 7:48 PM, Ido Ran <[email protected]> wrote:

> Hi,
> I like your discussion and I have something to add:
>
> The most important question when building a model (be it table schema or
> graph schema) is what do you want to do with it? What answers should I
> provide?
>
> If you take the categories in a graph database a step further, you will
> not store the properties of each product as simple properties on the node.
> Instead, you will have a relationship between the node representing the
> Product and the node representing the Attribute, and on that relationship
> you will store the actual property value.
> This way you can answer questions like which attributes are set on each
> product, which products also have this attribute, and more, using a simple
> traversal over the graph.
>
> This is the way I think the model can be:
> <http://img98.imageshack.us/i/shopcategorieserd.png/>
>
> I hope it helps,
> Ido
>
>
>
> On Sat, Dec 25, 2010 at 12:44 PM, Anders Nawroth
> <[email protected]> wrote:
>
> > Hi!
> >
> > > "What can't be expressed nicely in the ER-Diagram are the attribute
> > > values, as the actual names of those attributes are defined as data
> > > elsewhere in the model. This mix of metadata and data may be a problem
> > > when using other underlying data models, but for a graph database,
> > > this is actually how it's supposed to be used."
> > >
> > > Are you saying that you should generally store attribute data
> > > individually in separate nodes rather than storing the data as node
> > > properties?
> >
> > No, the model has two levels. It stores metadata, such as whether an
> > attribute is mandatory and its value range, in separate
> > nodes/relationships. For example, the metadata may define that all TV
> > sets must have a screen size. The actual values are then stored directly
> > on the nodes.
> >
> > So if we look at the bottom of the ER diagram:
> >
> >
> > https://github.com/neo4j-examples/java-shop-categories/raw/master/src/main/model/shop-categories-erd.png
> > the Product would normally have its attributes defined, and in this case
> > we put [AttributeValues] there instead, as the attribute names (keys)
> > are not known.
> >
> > By traversing from the node of a product, we can determine what
> > attributes it can have, what attributes it must have and so on.
> >
> > The point is that it's quite easy to have both flexibility and full type
> > safety when using a graph data model. Both data and metadata can be
> > stored in the same manner, and be directly connected to each other.
> >
> > Does this make sense to you?
> >
> >
> > /anders
> >
> >
> > >
> > > For example, here is a blog entry with its data contained in one node,
> > > using the node-property key/value pairs:
> > >
> > >     entry = graphdb.node(
> > >          Title="Some Title",
> > >          Summary="",
> > >          Body="bla bla bla",
> > >          language="US",
> > >          # Dates as ISO strings; a bare 2010-12-24 is integer subtraction.
> > >          Created="2010-12-24",
> > >          Published="2010-12-24",
> > >          Revised="2010-12-24")
> > >
> > > Alternatively, you could set it up so that the attributes are in
> > > separate nodes:
> > >
> > > attribute_subref_node = Subreference.Node.ATTRIBUTE_ROOT(graphdb)
> > > attribute_subref_node.ATTRIBUTE_TYPE(title_node)
> > > attribute_subref_node.ATTRIBUTE_TYPE(summary_node)
> > > attribute_subref_node.ATTRIBUTE_TYPE(body_node)
> > > attribute_subref_node.ATTRIBUTE_TYPE(language_node)
> > > attribute_subref_node.ATTRIBUTE_TYPE(published_date_node)
> > > attribute_subref_node.ATTRIBUTE_TYPE(revision_date_node)
> > >
> > >
> > > When should you choose one way over the other?
> > >
> > > Thanks.
> > >
> > > James
> > > _______________________________________________
> > > Neo4j mailing list
> > > [email protected]
> > > https://lists.neo4j.org/mailman/listinfo/user