Sir, I am looking for a data set in which I can find hidden facts, like the Panama leak. Please suggest a similar big data set.

On Wed, Oct 10, 2018 at 7:34 PM John Carlo <johncarlof1...@gmail.com> wrote:

> Hello Michael,
>
> thank you for your reply.
>
> I've re-implemented the db structure using unique words/nodes; the
> number of nodes has now dropped from 47.108.544 to 1.934.049.
>
> I still have a huge number of relationships (45.442.034), which now point
> to the unique nodes, and the queries are slow.
>
> My end goal is to find specific patterns in sentence structures, like the
> following example:
>
> (John)-[ACTION]->(eat)-[SUBJECT]->(apple)
>
> Any suggestion will be appreciated.
>
> Thank you very much
>
> On Wednesday, 10 October 2018 at 00:50:22 UTC+2, Michael Hunger wrote:
>>
>> Yes, I would only create every word node once. And then link the sentence
>> structures.
>> In general, just finding all the word nodes is probably not your end goal,
>> is it?
>>
>> It is best to ask on the Community Site & Forum
>> <https://community.neo4j.com> in the Modeling and Cypher categories.
>>
>> On Tue, Oct 9, 2018 at 11:00 PM John Carlo <johncar...@gmail.com> wrote:
>>
>>> Hello all,
>>>
>>> I've been using Neo4j for some weeks and I think it's awesome.
>>>
>>> I'm building an NLP application, and basically, I'm using Neo4j for
>>> storing the dependency graph generated by a semantic parser, something
>>> like this:
>>>
>>> https://explosion.ai/demos/displacy?text=Hi%20dear%2C%20what%20is%20your%20name%3F&model=en_core_web_sm&cpu=1&cph=0
>>>
>>> In the nodes, I store the single words contained in the sentences, and I
>>> connect them through relationships of a number of different types.
>>>
>>> For my application, I have the requirement to find all the nodes that
>>> contain a given word, so basically I have to search through all the
>>> nodes, finding those that contain the input word. Of course, I've already
>>> created an index on the word text field.
>>>
>>> I'm working on a very big dataset (by the way, the CSV importer is a
>>> great thing).
>>>
>>> On my laptop, the following query takes about 20 ms:
>>> MATCH (t:token) WHERE t.text="avoid" RETURN t.text
>>>
>>> Here are the details of the graph.db:
>>> 47.108.544 nodes
>>> 45.442.034 relationships
>>> 13.39 GiB db size
>>> Index created on token.text field
>>>
>>> PROFILE MATCH (t:token) WHERE t.text="switch" RETURN t.text
>>> ------------------------
>>> NodeIndexSeek
>>> 251,679 db hits
>>> ---------------
>>> Projection
>>> 251,678 db hits
>>> --------------
>>> ProduceResults
>>> 251,678 db hits
>>>
>>> I wonder if I'm doing something wrong in indexing such a large number of
>>> nodes. At the moment, I create a new node for each word I encounter in
>>> the text, even if the text is the same as in other nodes.
>>>
>>> Should I create a new node only when a new word is encountered, managing
>>> the sentence structures through relationships?
>>>
>>> Could you please help me with a suggestion or best practice to adopt for
>>> this specific case? I think that Neo4j is a great piece of software and
>>> I'd like to make the most out of it :-)
>>>
>>> Thank you very much
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Neo4j" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to neo4j+un...@googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
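
For anyone following the thread: the "create every word node once, then link the sentence structures" advice can be applied while preparing the files for the CSV importer. Below is a minimal Python sketch of that preprocessing step, not the poster's actual pipeline. The function names (build_import_rows, to_csv), the wordId column and the sentence property on relationships are all hypothetical. The header fields (:ID, :LABEL, :START_ID, :END_ID, :TYPE) follow the convention of Neo4j's bulk import tool as far as I know, so please check them against the version you run.

```python
import csv
import io


def build_import_rows(triples):
    """Assign one node id per distinct word and turn each dependency into a relationship row.

    `triples` is an iterable of (head_word, relation_type, dependent_word, sentence_id).
    The sentence_id is an assumption: it keeps track of which sentence a
    relationship came from once words are shared between sentences.
    """
    node_ids = {}        # word text -> node id, one entry per distinct word
    relationships = []   # (start_id, end_id, relation_type, sentence_id)
    for head, rel_type, dependent, sentence_id in triples:
        # setdefault returns the existing id, or stores the next free one
        start = node_ids.setdefault(head, len(node_ids))
        end = node_ids.setdefault(dependent, len(node_ids))
        relationships.append((start, end, rel_type, sentence_id))
    return node_ids, relationships


def to_csv(header, rows):
    """Render a header plus rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Two toy sentences that share the word "eat"
    triples = [
        ("John", "ACTION", "eat", 0),
        ("eat", "SUBJECT", "apple", 0),
        ("Mary", "ACTION", "eat", 1),
    ]
    node_ids, relationships = build_import_rows(triples)
    node_rows = [(node_id, word, "token") for word, node_id in node_ids.items()]
    print(to_csv(["wordId:ID", "text", ":LABEL"], node_rows))
    print(to_csv([":START_ID", ":END_ID", ":TYPE", "sentence:int"], relationships))
```

With this layout "eat" becomes a single node that both sentences point to, which is where the drop in node count comes from. The relationship count stays the same, because each dependency edge is still written once per sentence.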