Hi,

If you go with the ZIP option, just create your own pipeline that extends the base files pipeline and publish it as a package on GitHub, in case someone else needs it.
Best Regards,
Lhassan

On 18 August 2016 at 13:16, "Kasper Marstal" <kaspermars...@gmail.com> wrote:

> Hi all,
>
> I am scraping a couple of million documents and need to save space on my disk to store the data. An attractive option is to save the files directly to a ZIP file, since the compression ratio is really good with this kind of data (~18). However, the FilesPipeline does not allow me to provide my own files store unless I hack away at the Scrapy code itself, which I would like to avoid. So, a couple of questions for the Scrapy developers:
>
> - Are you interested in a patch that allows the FilesPipeline to accept custom store schemes? OR
> - Are you interested in a patch with a ZipFilesStore? In addition,
> - Is this ZIP-file approach a common way of dealing with large amounts of data, or do you have best practices on this subject that I am not aware of?
>
> Kind Regards,
> Kasper Marstal
>
> --
> You received this message because you are subscribed to the Google Groups "scrapy-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to scrapy-users+unsubscr...@googlegroups.com.
> To post to this group, send email to scrapy-users@googlegroups.com.
> Visit this group at https://groups.google.com/group/scrapy-users.
> For more options, visit https://groups.google.com/d/optout.