The index file is 1.3 GB in the +Bag case and 2 GB in the +String case. That doesn't seem like a big deal to me, given that the main entity file ends up being 32 GB.
I haven't checked, but given the relative sizes of the files, the range query might be comparatively faster. In my case, though, a tenth of a second here and there won't matter.

On Tue, Feb 11, 2014 at 2:54 AM, Joe Bogner <joebog...@gmail.com> wrote:

> Hey Alex -
>
> On Mon, Feb 10, 2014 at 9:31 AM, Alexander Burger <a...@software-lab.de> wrote:
>
>> Also, you can save quite some time if you pre-allocate memory, to avoid
>> an increase with each garbage collection. I would call (gc 800) in the
>> beginning, to allocate 800 MB, and (gc 0) in the end.
>
> Thanks for the reminder about gc. I remember you mentioning it over a year
> ago: https://email@example.com/msg03308.html.
> I added the gc and completed 30 days of import in two minutes. I also
> switched to my i7 (under cygwin too) vs my xen virtual host. It ended up
> using 2.7 gig of disk so I had to stop it. Again, I'm reminded and
> impressed with the speed.
>
> Thanks
> Joe