That's so unfortunate, I only saw this message now... awww... I tried to contact Neo for a meeting at their headquarters but it didn't work out; I would definitely have come. I had spoken with the biz dev department, and they would likely have been keen to meet up if there had been a business development opportunity with an already promising startup, rather than helping a single dude turn a prototype into a beta :)
So I resolved to dedicate my little time in SF (you know, ESTA tourist visa) not to coding and developing, but to getting feedback on a first product I crafted (not yet backed by a graph db), meeting people, and gathering info for moving back there... lovely and intense city.

Now back in Italy, I am working again on this aspect of full-text indexing, and yes, I did follow your link, but I can't get meaningful results with full-text. I am finding it a very difficult topic: the documentation has examples, but it is not clear which ones also work in Cypher and which do not (e.g. http://stackoverflow.com/questions/10140885/sorting-neo4js-lucene-index-queries-in-cypher). Another tutorial sheds some light in Java http://blog.armbruster-it.de/2014/10/deep-dive-on-fulltext-indexing-with-neo4j/ but I still did not manage. I see there is some support for the Lucene syntax in neo4j-rest-client for Python, but my concern is about the quality of the hit results, and I need a better understanding of how results are matched in a START query in Cypher - or how to do a full-text query in Python. I also posted on Stack Overflow: http://stackoverflow.com/questions/31862761/how-to-handle-thousands-of-rows-with-similar-string-patterns-in-neo4j-full-text

*Issue:* full-text index hits in a Cypher START look of poor quality. Examples using topic names from Wikipedia as a test:
- 'DNA' won't return the record whose name is the single word 'DNA' among the first results
- 'united states' (note, two words) will find the actual record 'united states' buried somewhere deep down a list of 11K rows; the first match is a long name that also contains the words 'united states', but that's not a meaningful match (e.g. 'List of something here and there that happened in united states some while ago' :)

That is a *major problem*, because you *cannot paginate the results in a logical order*: in order to return meaningful results to the user, you need to fetch all nodes and then apply some sorting (Levenshtein, for example). It is not clear how the results are sorted in Lucene/Neo, although, reading the Neo4j documentation and here: http://stackoverflow.com/questions/10140885/sorting-neo4js-lucene-index-queries-in-cypher it looks like Lucene should handle the sorting itself - with which logic? The results above seem to be hit almost randomly.

To clarify: I read about the Levenshtein metric in the Lucene documentation, and for sure the results fetched by a simple:

*start n=node:topic('name:(DNA)') return n skip 0 limit 10;*

do not use Levenshtein in my case. As a test, I tried *fuzzy searches*:

*start n=node:topic('name:DNA~0.4') return n skip 0 limit 10;*

and the results do not change.

As a second test, I can see that:

*http://localhost:7474/db/data/index/node/topic?query=name:%22dna%22*

returns the same results as the Cypher query above, *but* the syntax:

*http://localhost:7474/db/data/index/node/topic/dna*

returns *No index hits*.

I am trying to understand whether I am writing the syntax wrong, or whether there is an error - maybe it is still related to the batch import (you once told me there was an error in indexing)? But that would not make much sense, since START actually finds results that are pertinent (the keyword values are inside the 'name' key), just not meaningful (the values are 'distant' from the input keywords).

My goal is to get the best matches as the first results. I am able to return decent results by applying Levenshtein myself *after* the results are returned by a START query, but that, as said, is not feasible for matches with thousands of rows.
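For reference, this is roughly the kind of test script I would like to settle on from Python - a minimal sketch, assuming Neo4j 2.x on localhost:7474 with HTTP auth disabled, a legacy full-text index named 'topic' on the 'name' property, and only requests plus the standard library; it runs a few Lucene query variants through a START (and through the REST index endpoint) and prints the hits, with the difflib re-ranking at the end standing in for the Levenshtein workaround I described:

# Minimal test sketch - assumptions: Neo4j 2.x on localhost:7474, a legacy
# full-text index named 'topic' on the 'name' property, HTTP auth disabled
# (otherwise pass auth=('neo4j', '<password>') to the requests calls).
import difflib
import requests

CYPHER_ENDPOINT = "http://localhost:7474/db/data/transaction/commit"
INDEX_ENDPOINT = "http://localhost:7474/db/data/index/node/topic"


def start_query(lucene_query, limit=10):
    """Run a legacy-index START lookup via the transactional endpoint."""
    statement = "START n=node:topic('%s') RETURN n.name AS name LIMIT %d" % (
        lucene_query.replace("'", "\\'"), limit)
    resp = requests.post(CYPHER_ENDPOINT,
                         json={"statements": [{"statement": statement}]})
    resp.raise_for_status()
    return [row["row"][0] for row in resp.json()["results"][0]["data"]]


def rest_index_query(lucene_query, limit=10):
    """Same lookup through the legacy index REST query endpoint, for comparison."""
    resp = requests.get(INDEX_ENDPOINT, params={"query": lucene_query})
    resp.raise_for_status()
    return [hit["data"].get("name") for hit in resp.json()][:limit]


def rerank(names, needle):
    """Re-rank a bounded candidate list by similarity to the search term
    (difflib stands in here for a proper Levenshtein distance)."""
    key = lambda name: difflib.SequenceMatcher(None, name.lower(),
                                               needle.lower()).ratio()
    return sorted(names, key=key, reverse=True)


if __name__ == "__main__":
    for q in ('name:(DNA)',             # plain term query
              'name:DNA~0.4',           # fuzzy query
              'name:"united states"'):  # quoted phrase query
        print("START:", q, start_query(q))
        print("REST: ", q, rest_index_query(q))

    # Workaround: fetch a larger candidate set, then sort locally.
    candidates = start_query('name:"united states"', limit=200)
    print(rerank(candidates, "united states")[:10])

I am going through the transactional endpoint with plain requests, rather than neo4j-rest-client, only to keep the test independent of any driver; if the ordering of the START hits is supposed to be Lucene's relevance score, a comparison like this should at least make the behaviour visible.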
*Could you please provide a brief guide or mockup db to test whether my indexes are perhaps corrupted, or a benchmark / guide to test how the Lucene START matches and sorts results, possibly in Python - not Java?*

On Tuesday, 12 May 2015 at 11:06:02 UTC+2, Michael Hunger wrote:
>
> Hi Luigi,
>
> did you try the fulltext lucene index approach that we discussed back
> then? Can you share your latest approach?
> I presume you do a regexp search, which does not use an index?
>
> I wrote that blog post a while ago, which is still valid for 2.2
> http://jexp.de/blog/2014/03/full-text-indexing-fts-in-neo4j-2-0/
>
> For Neo4j 2.3 there should be an automatic solution for this issue coming
> up, using LIKE.
>
> If you have time next week you can probably drop by the office to say hi.
>
> Michael
>
>
> On 12.05.2015 at 01:00, gg4u <[email protected]> wrote:
>
> hi folks,
> i am temporarily in San Francisco.
>
> I have my db with simple node titles as names.
>
> I would like to open a crowd-funding campaign.
>
> I need a final step.
> Full-text search is quite lame... It's ok if you search for the exact node
> name, but it takes several seconds to match a substring in the name.
>
> I thought maybe there's a plugin to import into Elastic Search or alike.
> River was there, but it has been discontinued.
>
> Is there anybody who may be willing to help with an importer from
> Neo to Elastic Search, or to solve the full-text indexing?
> I could offer a dinner in exchange or... well... team up and share what I've
> done, if you like my project!
>
> A mockup is here:
> www.xdiscovery.com/en/graph/22
>
