Yes, I would create each word node only once and then link the sentence
structures to it.
In general, just finding all the word nodes is probably not your end goal,
is it?
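
A minimal sketch of what that could look like in Cypher, under assumptions
that are mine and not from your dataset: the labels Word, Occurrence and
Sentence, the relationship types OF_WORD and IN_SENTENCE, and the sample id
and position values are all hypothetical. The constraint uses the 3.x syntax;
newer versions spell it CREATE CONSTRAINT FOR ... REQUIRE ... IS UNIQUE.

```cypher
// Hypothetical schema: one :Word node per distinct text,
// one :Occurrence node per token position in a :Sentence.
CREATE CONSTRAINT ON (w:Word) ASSERT w.text IS UNIQUE;

// MERGE reuses the existing :Word (and :Sentence) node if it is already
// there; only the :Occurrence node is created fresh for each token.
MERGE (w:Word {text: "avoid"})
MERGE (s:Sentence {id: 42})
CREATE (o:Occurrence {position: 3})-[:OF_WORD]->(w)
CREATE (o)-[:IN_SENTENCE]->(s);

// Look up the single :Word node, then traverse to where it occurs.
MATCH (w:Word {text: "avoid"})<-[:OF_WORD]-(o:Occurrence)-[:IN_SENTENCE]->(s:Sentence)
RETURN s.id, o.position;
```

In a model like this, the dependency relationships from the parser would
connect the Occurrence nodes, while the word text itself is stored only once.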

It's best to ask on the Community Site & Forum <https://community.neo4j.com>
in the Modeling and Cypher categories.


On Tue, Oct 9, 2018 at 11:00 PM John Carlo <johncarlof1...@gmail.com> wrote:

> Hello all,
>
> I've been using Neo4j for some weeks and I think it's awesome.
>
> I'm building an NLP application, and basically, I'm using Neo4j for
> storing the dependency graph generated by a semantic parser, something like
> this:
>
> https://explosion.ai/demos/displacy?text=Hi%20dear%2C%20what%20is%20your%20name%3F&model=en_core_web_sm&cpu=1&cph=0
>
> In the nodes, I store the single words contained in the sentences, and I
> connect them through relations with a number of different types.
>
> For my application, I have the requirement to find all the nodes that
> contain a given word, so basically I have to search through all the nodes,
> finding those that contain the input word.  Of course, I've already created
> an index on the word text field.
>
> I'm working on a very big dataset (by the way, the CSV importer is a great
> thing).
>
> On my laptop, the following query takes about 20 ms:
> MATCH (t:token) WHERE t.text="avoid" RETURN t.text
>
> Here are the details of the graph.db:
> 47,108,544 nodes
> 45,442,034 relationships
> 13.39 GiB db size
> Index created on token.text field
>
> PROFILE MATCH (t:token) WHERE t.text="switch" RETURN t.text
> ------------------------
> NodeIndexSeek
> 251,679 db hits
> ---------------
> Projection
> 251,678 db hits
> --------------
> ProduceResults
> 251,678 db hits
>
> I wonder if I'm doing something wrong in indexing such a large number of
> nodes. At the moment, I create a new node for each word I encounter in the
> text, even if the text is the same as that of other nodes.
>
> Should I create a new node only when a new word is encountered, managing
> the sentence structures through relationships?
>
> Could you please help me with a suggestion or best practice to adopt for
> this specific case? I think that Neo4j is a great piece of software and I'd
> like to make the most out of it :-)
>
> Thank you very much
>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to neo4j+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
