I'm sure there are some bottlenecks that could be eliminated in camput; it needs some cleanup/refactoring anyway. That said, camput does a lot of different things, which explains some of its complexity.
I'm wondering: now that we have a pretty good grip on writing importers, maybe we could write a "files importer" that would be more streamlined and more efficient than camput for large filesets. It could even watch a directory and push files to camlistore as they get added to the directory. Brad, WDYT?

On 31 May 2016 at 07:52, Alok Parlikar <[email protected]> wrote:
> Accidentally clicked POST. Editing here and Reposting.
>
> As I am feeding data into my new camlistore, things are running quite
> slowly. I must be doing something wrong. Might be the combination of
> fsync+usb3+ext4. But perhaps someone has a quick tip to help me go faster :)
>
> I am still waiting to receive new hardware which will be my primary
> camlistore (which will have ZFS RAID). Until then, I was hoping to ingest
> data and create blobs that I could dump later. So, currently, I have this:
>
> Data size: About 2 TB. Mostly old backups of media files, code and
> documents.
>
> Running camlistored and camput on: 64-bit Ubuntu laptop; Intel i5-4200M @
> 2.5GHz; 8GB RAM.
>
> Reading Blobs From: USB3.0 interface to an ext4 backup disk.
> Writing Blobs To: USB3.0 interface to a different disk. (Which will later be
> copied off to the new hardware).
> Leveldb Cache: on my laptop's SSD.
>
> (i) camput seemed to be moving much slower than disk speeds, even
> considering the USB3 interface. I'm seeing maybe about 1MBps going into the
> blobstore. "strace" showed that fsync was a frequent syscall.
>
> (ii) I tried this:
>
> camput file -permanode somedir
> <Took overnight for 50GB>
>
> And then again for the same content
> camput file somedir
> <Taking a few hours again, and reporting duplicates for the same 50GB>
>
> So I have two questions:
>
> (i) Does anyone have experience with disabling fsync with blobpacked? I see
> two Flush() calls on zipwriter. Not sure if that will work though.
> (ii) What else can I do to speed up camput?
> Perhaps -- run multiple camputs in parallel on different directories?
>
> If blobs get created at ~1MBps, 2TB will take a looong time :-)
>
> --
> You received this message because you are subscribed to the Google Groups
> "Camlistore" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
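
For scale: at ~1 MB/s, 2 TB is about 2,000,000 seconds, or roughly 23 days. The "multiple camputs in parallel on different directories" idea from the quoted message could be tried with something like the sketch below. It is only an illustration. `runParallel` and `camputDir` are hypothetical helper names, the only camput invocation used is the `camput file <dir>` form quoted above, and whether parallelism helps at all depends on where the bottleneck is (if fsync on the destination disk is the limit, it may not).

```go
package main

import (
	"os"
	"os/exec"
	"sync"
)

// runParallel calls run once per directory, with at most `workers` calls in
// flight at any time. It returns one error slot per directory (nil on success).
func runParallel(dirs []string, workers int, run func(dir string) error) []error {
	if workers < 1 {
		workers = 1
	}
	errs := make([]error, len(dirs))
	sem := make(chan struct{}, workers) // counting semaphore
	var wg sync.WaitGroup
	for i, dir := range dirs {
		wg.Add(1)
		sem <- struct{}{} // blocks while `workers` uploads are already running
		go func(i int, dir string) {
			defer wg.Done()
			defer func() { <-sem }()
			errs[i] = run(dir)
		}(i, dir)
	}
	wg.Wait()
	return errs
}

// camputDir shells out to "camput file <dir>" and forwards its output.
// Assumption: camput is on PATH and already configured for the server.
func camputDir(dir string) error {
	cmd := exec.Command("camput", "file", dir)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}
```

A caller would pass the top-level directories and a worker count, for example `runParallel(os.Args[1:], 4, camputDir)`. The count of 4 is arbitrary and would need tuning against the disks involved.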

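The "watch a directory and push files as they get added" idea above could start from a plain polling loop. This is a rough sketch under stated assumptions, not a proposal for the actual importer API. All names here (`newFiles`, `snapshotDir`, `watch`, the `upload` callback) are hypothetical, the scan is non-recursive, and the real work of turning a file into blobs is left to the callback.

```go
package main

import (
	"os"
	"path/filepath"
	"sort"
	"time"
)

// newFiles compares a snapshot (path -> modification time in Unix nanoseconds)
// against seen, returns the paths that are new or modified in sorted order,
// and records them in seen.
func newFiles(snapshot, seen map[string]int64) []string {
	var fresh []string
	for path, mod := range snapshot {
		if last, ok := seen[path]; !ok || mod > last {
			seen[path] = mod
			fresh = append(fresh, path)
		}
	}
	sort.Strings(fresh)
	return fresh
}

// snapshotDir lists the regular files directly inside dir (non-recursive).
func snapshotDir(dir string) (map[string]int64, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	snap := make(map[string]int64)
	for _, e := range entries {
		if !e.Type().IsRegular() {
			continue
		}
		info, err := e.Info()
		if err != nil {
			return nil, err
		}
		snap[filepath.Join(dir, e.Name())] = info.ModTime().UnixNano()
	}
	return snap, nil
}

// watch polls dir every interval and passes each new or modified file to
// upload. It returns on the first scan or upload error.
func watch(dir string, interval time.Duration, upload func(path string) error) error {
	seen := make(map[string]int64)
	for {
		snap, err := snapshotDir(dir)
		if err != nil {
			return err
		}
		for _, p := range newFiles(snap, seen) {
			if err := upload(p); err != nil {
				return err
			}
		}
		time.Sleep(interval)
	}
}
```

Polling is used here only because it needs nothing beyond the standard library. A real importer would more likely rely on filesystem notifications and would need to cope with files that are still being written when they are first seen.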