Scalable Scrapy environment

christoph Fri, 23 May 2014 08:03:05 -0700

Hi,

I'm wondering what a good environment would be for a scalable Scrapy 
project, similar to Scrapinghub.
I'd like to start with a single vserver/root server for crawling and data 
storage, with the option to add more servers when I need more scraping 
power or database space.
According to a blog entry 
(http://blog.scrapinghub.com/2013/07/26/introducing-dash/), Scrapinghub 
uses Cloudera CDH (running on which OS?) and stores its data in HBase. 
Is this a good choice?


Is there any information on how to set up Scrapy in a CDH environment and 
save data into HBase?

Thank you,
Christoph

