Some time ago I solved a similar problem (though I needed continuous grabbing) by organizing several workers: https://medium.com/@vladimir_vg/dsl-74d0fcf03cae (in Russian). You probably don't need anything that complex, but you may get some ideas from it.
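
For the simple one-off case, here is a rough sketch of the recursive visit rather than a finished crawler. It is not the approach from the article above. The names `build_tree` and `extract_links` are made up for illustration, and I am assuming you only want links on the same host, can afford to hold every page in memory, and are fine with a depth limit. Each page ends up under the first parent that reached it; the `visited` set is what stops link cycles from recursing forever.

```ruby
require "set"
require "uri"

# Hypothetical helper: collect absolute, same-host http(s) links from one page.
# Needs the nokogiri gem; it is required lazily so the traversal below
# has no gem dependency of its own.
def extract_links(html, page_url)
  require "nokogiri"
  host = URI.parse(page_url).host
  found = Nokogiri::HTML(html).css("a[href]").map do |a|
    begin
      target = URI.join(page_url, a["href"])
      target.fragment = nil # treat page#top and page as the same page
      target
    rescue URI::Error
      nil # skip hrefs that are not parseable URIs
    end
  end
  found.compact
       .select { |u| u.host == host && %w[http https].include?(u.scheme) }
       .map(&:to_s)
       .uniq
end

# Depth-first visit. Returns a nested hash:
#   { url: ..., html: ..., children: [ ...same shape... ] }
# fetch: callable taking a URL and returning the page body
# links: callable taking (html, url) and returning child URLs
def build_tree(url, fetch:, links:, visited: Set.new, max_depth: 5)
  visited << url
  html = fetch.call(url)
  children = []
  if max_depth > 0
    links.call(html, url).each do |child|
      next if visited.include?(child) # already placed elsewhere in the tree
      children << build_tree(child, fetch: fetch, links: links,
                                    visited: visited, max_depth: max_depth - 1)
    end
  end
  { url: url, html: html, children: children }
end

# Possible usage (assumes open-uri and a reachable site, so left commented out):
#   require "open-uri"
#   tree = build_tree("http://example.com/",
#                     fetch: ->(u) { URI.open(u).read },
#                     links: method(:extract_links))
```

Passing `fetch` and `links` in as callables keeps the traversal separate from HTTP and parsing, so you can add throttling, retries or error handling in `fetch` without touching the recursion.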
On Tue, May 12, 2015 at 7:42 AM, Роман Ярыгин <[email protected]> wrote:
> Hello!
>
> I need to grab all of a site's data along with its full tree structure.
> Every page has links to child pages. How can I build the site tree with
> Nokogiri? It has to visit pages recursively and scrape all directory
> links, but I can't work out the full algorithm. How do I do that?
> P.S. I don't need to "save the whole site to disk with HTTrack". The data
> will be processed and copied to the new, redesigned version of the
> original site.

