I don't have paging in yet... I've been slacking, I know. I'll get to it soon.

On Fri, Jul 1, 2011 at 7:27 AM, Michael Hunger <[email protected]> wrote:
> You have a few options here:
>
> * paging is right now only supported in the REST API for traversals; the other
>   request types will get it in 1.5 (so you could use the paging functionality
>   for your traverser, don't know if Max supports that already in neography)
> * you could use either the cypher
>   (http://docs.neo4j.org/chunked/snapshot/cypher-plugin.html) or the gremlin
>   plugin (http://docs.neo4j.org/chunked/snapshot/gremlin-plugin.html) to write
>   a request against their "execute_script" api which then does the index lookup
>   + limit
> * you can write your own server plugin doing that
>   http://docs.neo4j.org/chunked/snapshot/server-plugins.html
>
> Cheers
>
> Michael
>
> On 01.07.2011 at 14:04, Laurent Laborde wrote:
>
>> Ruby crashes when I request all the pages with parsed = false.
>> Using the REST interface directly with curl: the result is a huge
>> JSON with ~10,000 nodes.
>>
>> Is there a way to limit the result size, like a SQL "SELECT * from
>> node where parsed == 'true' limit 100;" ?
>> I tried using a traverser instead of requesting the index:
>>
>> node_to_parse = @neo.traverse(ob_root_node, "nodes", {
>>   "relationships" => [{"type"=> "link", "direction" => "out" }],
>>   "prune evaluator" => {"language" => "javascript", "body" =>
>>     "position.endNode().getProperty('parsed') == 'false';"},
>>   "return filter" => {"language" => "builtin", "name" => "all but start node"}})
>>
>>
>> /home/ker2x/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/net/protocol.rb:140:in `rescue in rbuf_fill': Timeout::Error (Timeout::Error)
>>     from /home/ker2x/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/net/protocol.rb:134:in `rbuf_fill'
>>     from /home/ker2x/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/net/protocol.rb:116:in `readuntil'
>>     from /home/ker2x/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/net/protocol.rb:126:in `readline'
>>     from /home/ker2x/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/net/http.rb:2219:in `read_status_line'
>>     from /home/ker2x/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/net/http.rb:2208:in `read_new'
>>     from /home/ker2x/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/net/http.rb:1191:in `transport_request'
>>     from /home/ker2x/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/net/http.rb:1177:in `request'
>>     from /home/ker2x/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/net/http.rb:1170:in `block in request'
>>     from /home/ker2x/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/net/http.rb:627:in `start'
>>     from /home/ker2x/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/net/http.rb:1168:in `request'
>>     from /home/ker2x/.rvm/gems/ruby-1.9.2-p180/gems/httparty-0.7.8/lib/httparty/request.rb:69:in `perform'
>>     from /home/ker2x/.rvm/gems/ruby-1.9.2-p180/gems/httparty-0.7.8/lib/httparty.rb:390:in `perform_request'
>>     from /home/ker2x/.rvm/gems/ruby-1.9.2-p180/gems/httparty-0.7.8/lib/httparty.rb:358:in `post'
>>     from /home/ker2x/.rvm/gems/ruby-1.9.2-p180/gems/httparty-0.7.8/lib/httparty.rb:426:in `post'
>>     from /home/ker2x/.rvm/gems/ruby-1.9.2-p180/gems/neography-0.0.13/lib/neography/rest.rb:363:in `post'
>>     from /home/ker2x/.rvm/gems/ruby-1.9.2-p180/gems/neography-0.0.13/lib/neography/rest.rb:317:in `traverse'
>>     from nokogiri-test.rb:26:in `<main>'
>>
>>
>> --
>> Keru
>>
>> On Fri, Jul 1, 2011 at 1:15 PM, Michael Hunger
>> <[email protected]> wrote:
>>> Can you call the index REST request manually and see what it returns?
>>>
>>> see here
>>> http://components.neo4j.org/neo4j-server/snapshot/rest.html#Index_search_-_Exact_keyvalue_lookup
>>>
>>> curl -H Accept:application/json http://localhost:7474/db/data/index/node/my_nodes/the_key/the_value%20with%20space
>>>
>>> see here:
>>> http://stackoverflow.com/questions/547127/in-ruby-how-do-i-replace-the-question-mark-character-in-a-string
>>>
>>> require "addressable/uri"
>>> Addressable::URI.encode_component("http://test.com?test/test%test",Addressable::URI::CharacterClasses::PATH)
>>>
>>> Cheers
>>>
>>> Michael
>>>
>>>
>>> On 01.07.2011 at 11:23, Laurent Laborde wrote:
>>>
>>>> After a few runs (and more and more pages to crawl) it seems
>>>> that the result returned by the index is too big:
>>>>
>>>> /home/ker2x/.rvm/gems/ruby-1.9.2-p180/gems/crack-0.1.8/lib/crack/json.rb:54: stack level too deep (SystemStackError)
>>>>
>>>> Any idea? Workaround?
>>>>
>>>> thank you
>>>>
>>>> --
>>>> Ker2x
>>>>
>>>> On Fri, Jul 1, 2011 at 10:44 AM, Laurent Laborde <[email protected]>
>>>> wrote:
>>>>> I used Base64.encode64 instead! It still didn't work.
>>>>> So I used Base64.encode64 and get_node_index instead of
>>>>> find_node_index and it worked \o/
>>>>>
>>>>> --
>>>>> Ker2x
>>>>>
>>>>> On Fri, Jul 1, 2011 at 10:25 AM, Laurent Laborde <[email protected]>
>>>>> wrote:
>>>>>> Thank you for your help.
>>>>>> As you probably noticed, I'm not good with Ruby (I'm a sysadmin ^^)
>>>>>>
>>>>>> I tried using URI.encode but it doesn't work as expected.
>>>>>>
>>>>>> irb(main):001:0> require 'uri'
>>>>>> => true
>>>>>> irb(main):002:0> puts URI.escape("http://www.over.blog.com/")
>>>>>> http://www.over.blog.com/
>>>>>> => nil
>>>>>> irb(main):003:0> puts URI.encode("http://www.over.blog.com/")
>>>>>> http://www.over.blog.com/
>>>>>> => nil
>>>>>>
>>>>>> I guess that the output should be more like
>>>>>> "http%3A%2F%2Fwww.over-blog.com%2", shouldn't it?
>>>>>>
>>>>>> --
>>>>>> Ker2x
>>>>>>
>>>>>> On Thu, Jun 30, 2011 at 6:40 PM, Michael Hunger
>>>>>> <[email protected]> wrote:
>>>>>>> you have to escape the url index value
>>>>>>> otherwise the jersey rest framework consumes it silently. I had this
>>>>>>> problem when working on the birdies demo app. Took me a while to work
>>>>>>> that out.
>>>>>>>
>>>>>>> see http://github.com/jexp/birdies
>>>>>>> and http://birdies.heroku.com
>>>>>>>
>>>>>>> Michael
>>>>>>>
>>>>>>> Sent from my iBrick4
>>>>>>>
>>>>>>>
>>>>>>> On 30.06.2011 at 17:43, Laurent Laborde <[email protected]> wrote:
>>>>>>>
>>>>>>>> Friendly greetings!
>>>>>>>> I've been on the same problem for many days (an hour per day) and I can't
>>>>>>>> find a solution.
>>>>>>>> I have 2 indexes (see source code below).
>>>>>>>> No problem with the "parsed" index, but the "url" index never returns
>>>>>>>> any result.
>>>>>>>> I don't know if it's because the url isn't indexed or because the query on
>>>>>>>> the index is wrong.
>>>>>>>> Or something else?
>>>>>>>>
>>>>>>>> Could you please take a look and see what's wrong?
>>>>>>>> thank you
>>>>>>>>
>>>>>>>> (you can try to run the script, it works)
>>>>>>>>
>>>>>>>> require 'nokogiri'
>>>>>>>> require 'open-uri'
>>>>>>>> require 'neography'
>>>>>>>>
>>>>>>>> #init neography
>>>>>>>> @neo = Neography::Rest.new
>>>>>>>> neo_root = @neo.get_root
>>>>>>>>
>>>>>>>> domaine = 'http://www.over-blog.com/'
>>>>>>>> parsed_idx = "ob_parsed_idx"
>>>>>>>> url_idx = "ob_url_idx"
>>>>>>>>
>>>>>>>> #FIRST RUN
>>>>>>>> #ob_root_node = @neo.create_node("domaine" => domaine, "parsed" => "false", "url" => domaine)
>>>>>>>> #@neo.create_relationship("obgraph", neo_root, ob_root_node)
>>>>>>>> #pidx = @neo.create_node_index(parsed_idx)
>>>>>>>> #uidx = @neo.create_node_index(url_idx)
>>>>>>>> #@neo.add_node_to_index(parsed_idx, "parsed", "false", ob_root_node)
>>>>>>>> ##@neo.add_node_to_index(url_idx, "url", domaine, ob_root_node)
>>>>>>>> #node_to_parse = @neo.get_node_index(parsed_idx, "parsed", "false")
>>>>>>>>
>>>>>>>> ob_root_node = @neo.traverse(neo_root, "nodes", { "relationships" => [{"type"=> "obgraph", "direction" => "out" }], "depth" => 1})
>>>>>>>> #node_to_parse = @neo.traverse(ob_root_node, "nodes", { "relationships" => [{"type"=> "link", "direction" => "out" }] })
>>>>>>>> node_to_parse = @neo.get_node_index(parsed_idx, "parsed", "false")
>>>>>>>>
>>>>>>>> #print @neo.list_node_indexes
>>>>>>>>
>>>>>>>> node_to_parse.each do |node|
>>>>>>>>
>>>>>>>>   url_to_parse = @neo.get_node_properties(node)["url"]
>>>>>>>>   printf("exploring : %s\n", url_to_parse)
>>>>>>>>
>>>>>>>>   doc = Nokogiri::HTML(open(url_to_parse))
>>>>>>>>   @neo.set_node_properties(node, {"parsed" => "true"})
>>>>>>>>   @neo.remove_node_from_index(parsed_idx, node)
>>>>>>>>   @neo.add_node_to_index(parsed_idx, "parsed", "true", node)
>>>>>>>>
>>>>>>>>   doc.xpath('//a').each do |link|
>>>>>>>>
>>>>>>>>     link_text = link.content.strip()
>>>>>>>>     link_url = link['href'].to_s().strip()
>>>>>>>>     link_title = link['title'].to_s().strip()
>>>>>>>>
>>>>>>>>     link_url = link_url.sub(/#.*$/, "")
>>>>>>>>
>>>>>>>>     if(link_url =~ /^\/.*/)
>>>>>>>>       link_url = link_url.sub(/^\//, '')
>>>>>>>>       link_url = domaine + link_url
>>>>>>>>     end
>>>>>>>>
>>>>>>>>     if(link_text == '')
>>>>>>>>       link_text = link_title
>>>>>>>>     end
>>>>>>>>
>>>>>>>>
>>>>>>>>     #skipping empty stuff
>>>>>>>>     next if link_url.empty?
>>>>>>>>     next if link_text.empty?
>>>>>>>>
>>>>>>>>     node_found = @neo.find_node_index(url_idx, "url", link_url)
>>>>>>>>     #node_found = @neo.traverse(ob_root_node, "nodes", { "relationships" => [{"direction" => "out" }], "prune evaluator" => {"language" => "javascript", "body" => "position.endNode().getProperty(url) == #{link_url};"}, "return filter" => {"language" => "builtin", "name" => "all but start node"}})
>>>>>>>>     print "\nsearching url #{link_url}\n"
>>>>>>>>     printf("node_found : %s \n", node_found)
>>>>>>>>     if(node_found.nil?)
>>>>>>>>       printf("create node %s\n", link_url)
>>>>>>>>       nnode = @neo.create_node("parsed" => "false", "url" => link_url)
>>>>>>>>       @neo.add_node_to_index(url_idx, "url", link_url, nnode)
>>>>>>>>       @neo.add_node_to_index(parsed_idx, "parsed", "false", nnode)
>>>>>>>>     else
>>>>>>>>       printf("node_found : %s \n", node_found)
>>>>>>>>     end
>>>>>>>>
>>>>>>>>
>>>>>>>>     nrel = @neo.create_relationship("link", node, nnode)
>>>>>>>>     @neo.set_relationship_properties(nrel, {"text" => link_text})
>>>>>>>>
>>>>>>>>     #printf("%s => %s\n", link_text, link_url)
>>>>>>>>
>>>>>>>>   end
>>>>>>>>
>>>>>>>>   sleep(1.0)
>>>>>>>>
>>>>>>>>
>>>>>>>> end
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Laurent "ker2x" Laborde
>>>>>>>> Sysadmin & DBA at http://www.over-blog.com/
>>>>>>>> _______________________________________________
>>>>>>>> Neo4j mailing list
>>>>>>>> [email protected]
>>>>>>>> https://lists.neo4j.org/mailman/listinfo/user
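
On the escaping problem discussed earlier in this thread: `URI.escape` / `URI.encode` leave `:` and `/` untouched because those characters are legal in a complete URI, which is why the irb output came back unchanged. A minimal Ruby sketch of percent-encoding the value as a single path segment follows. It uses `ERB::Util.url_encode` from the standard library; `escape_index_value` and `index_lookup_path` are hypothetical helper names, not part of neography, and the path shape is only copied from the curl example in the thread. Whether the server accepts encoded slashes inside a path segment is an assumption to verify; the thread itself reports Base64 encoding with get_node_index as the workaround that worked.

```ruby
require 'erb'

# Hypothetical helper (not part of neography): percent-encode a value so it
# can be used as one path segment of an index lookup URL.
def escape_index_value(value)
  ERB::Util.url_encode(value)
end

# Builds a lookup path shaped like the curl example in this thread.
# The index and key names passed in are only illustrative.
def index_lookup_path(index, key, value)
  "/db/data/index/node/#{index}/#{key}/#{escape_index_value(value)}"
end

puts escape_index_value("http://www.over-blog.com/")
# prints http%3A%2F%2Fwww.over-blog.com%2F

puts index_lookup_path("ob_url_idx", "url", "http://www.over-blog.com/")
```

Unlike `CGI.escape`, this encodes a space as `%20` rather than `+`, which matches the `the_value%20with%20space` form used in the curl example.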

