Okay, that's pretty much the sensible answer I expected: "YMMV, try some 
experiments and let's work with hard data."

I need to reread Andy's post [1] about the packing algorithm to make sure I 
understand what's going on there.

The hardware in question is definitely "$$$"-- maybe even "$$$$". {grin} This 
is contract work, so before going much further I need to check with my 
principals about prioritizing this kind of speculative work, and we will plan 
accordingly.

I'm not immediately sure how to measure the rates in tuples/sec-- I'm not 
aware of any such logging from sort itself, though I haven't looked for it. 
Can tdbloader2 emit progress messages as it packs tuples, the way it 
periodically reports an average rate during the 'data' phase?
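In the meantime, a crude way to get a tuples/sec figure for the sort step 
might be to time the run and divide the line count by the elapsed wall-clock 
time. A sketch (the filenames, buffer size, and parallelism value below are 
made up, and it assumes GNU sort/coreutils):

```shell
#!/bin/sh
# Crude tuples/sec estimate for the sort step: count input lines,
# time the sort, and divide. Filenames and flag values are placeholders.
seq 1000000 | shuf > tuples.tmp          # stand-in for the real tuple file
lines=$(wc -l < tuples.tmp)
start=$(date +%s)
sort -n --parallel=4 -S 256M tuples.tmp > sorted.tmp
end=$(date +%s)
elapsed=$(( end - start ))
[ "$elapsed" -gt 0 ] || elapsed=1        # guard against sub-second runs
echo "$lines tuples in ${elapsed}s (~$(( lines / elapsed )) tuples/sec)"
```

Running the same input through with different --parallel values should at 
least give us comparable numbers, even if they aren't what sort reports 
internally.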

---
A. Soroka
The University of Virginia Library

[1] https://seaborne.blogspot.com/2010/12/repacking-btrees.html

> On Oct 28, 2016, at 11:12 AM, Andy Seaborne <[email protected]> wrote:
> 
> 
> 
> On 28/10/16 15:58, A. Soroka wrote:
>> That's right-- if I understand it correctly, there are two steps--
>> POSIX sort to develop the index orderings, and then packing the
>> actual index files.
>> 
>> For the POSIX sort step, it's certainly true that more parallelism
>> than needed would be a bad thing. With Andy's help I just made a
>> commit that allows a little more control using the common --parallel
>> flag for sort. But the current ergonomics seem suboptimal. E.g. with
>> the current settings indexing a 300Mt dataset on a 24-core box with
>> fast storage, I saw only one core in full use, and very little IO
>> usage. Aliasing in some parallelism via the sort flag brought several
>> more cores into play and cut the time spent by two-thirds.
> 
> What rates are you getting?
> 
>> I don't
>> know how normal that is, but for the sort step, my argument is not
>> that we could find universally better ergonomics, but that we could
>> bake some flexibility in for those who want to try adjustments on
>> their particular hardware, including the ability to try running
>> multiple sorts at one time.
> 
> That sounds like an interesting experiment to carry out and, if successful, 
> change the released code.
> 
>> 
>> For the other step, I don't feel like I understand the index-packing
>> code well enough yet to form an opinion, which is one reason for the
>> question. It seems that it could run in parallel without difficulty,
>> but maybe I don't understand the relationships between the indexes
>> well enough.
> 
> Index packing is I/O bound and is sequential. There is little computation 
> going on.
> 
> Doing two packings in parallel would break up the sequential write sequence, 
> so there would need to be a noticeable gain in some way to compensate for 
> the impact.
> 
> Bus contention when it's an SSD may come into play.  The quality/speed of the 
> connection to the SSD is related to how much $$$ the server cost!
> 
>> Another question then would be: maybe we could split the current
>> 'index' phase into 'order' and 'pack' phases, again for those who
>> would like to try tuning each step for their situation?
> 
> Interesting possibility - needs trying out and bedding down before it goes 
> into the standard release scripts IMO.  What works well in one environment 
> may not in another.  Lots of options suit some people and not others.
> 
>    Andy
> 
>> ---
>> A. Soroka
>> The University of Virginia Library
>> 
>>> On Oct 28, 2016, at 10:24 AM, Rob Vesse <[email protected]> wrote:
>>> 
>>> If memory serves those are the phases that use POSIX sort right?
>>> 
>>> Sort will try and do an in-memory sort as far as possible and fall back to 
>>> a disk-based merge sort if not. Also we usually configure sort to run in 
>>> parallel
>>> 
>>> If you try to process different indexes in parallel you would create a lot 
>>> of memory and disk contention, which would likely slow down overall 
>>> performance
>>> 
>>> For sufficiently large data sets there is also a risk of exhausting disk 
>>> space during the sort phase and building multiple indexes in parallel would 
>>> only exacerbate this
>>> 
>>> Rob
>>> 
>>> On 28/10/2016 14:33, "A. Soroka" <[email protected]> wrote:
>>> 
>>>   I'm still learning about tdbloader2 and have another question about the 
>>> index phase: is there any reason why the processes for the various index 
>>> orderings (SPO, GSPO, etc.) couldn't go on in parallel? Or am I missing 
>>> some switch or setting that already allows that?
>>> 
>>>   ---
>>>   A. Soroka
>>>   The University of Virginia Library
>>> 
>> 
