Re: [Firebird-devel] The Power of C++11 in CUDA 7

James Starkey Fri, 20 Mar 2015 13:46:28 -0700

I think it would be extremely difficult to implement both fine-grained
multi-threading and co-processor exploitation in a shared metadata
implementation.  If Firebird were rearchitected so that each connection had
a dedicated metadata cache (and the caches had a mechanism to propagate
metadata changes to all threads), then it would probably be feasible, but
it would also incur all sorts of additional overhead.
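
To make the per-connection cache idea concrete, here is a minimal C++
sketch, not Firebird code.  The names SharedCatalog and ConnectionCache and
the version-counter scheme are assumptions chosen for illustration.  Each
connection keeps a private copy of the metadata and refreshes it when a
shared version number changes; the check on every lookup and the full copy
on every change are examples of the extra overhead mentioned above.

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical authoritative catalog shared by all connections.
class SharedCatalog {
public:
    // Record a metadata change and bump the version so caches can notice it.
    void define(const std::string& name, const std::string& definition) {
        std::lock_guard<std::mutex> guard(mutex_);
        relations_[name] = definition;
        version_.fetch_add(1, std::memory_order_release);
    }

    // Cheap, lock-free check used by every cache lookup.
    std::uint64_t version() const {
        return version_.load(std::memory_order_acquire);
    }

    // Copy the whole catalog together with the version it corresponds to.
    std::unordered_map<std::string, std::string>
    snapshot(std::uint64_t& versionOut) const {
        std::lock_guard<std::mutex> guard(mutex_);
        versionOut = version_.load(std::memory_order_relaxed);
        return relations_;
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<std::string, std::string> relations_;
    std::atomic<std::uint64_t> version_{0};
};

// Hypothetical per-connection cache: used by one thread, so reads of the
// local copy need no locking.
class ConnectionCache {
public:
    explicit ConnectionCache(const SharedCatalog& catalog) : catalog_(catalog) {}

    // Returns the cached definition, or nullptr if the name is unknown.
    const std::string* lookup(const std::string& name) {
        if (catalog_.version() != localVersion_) {
            // Propagation step: pull the changed metadata into this cache.
            local_ = catalog_.snapshot(localVersion_);
            ++refreshes_;
        }
        auto it = local_.find(name);
        return it == local_.end() ? nullptr : &it->second;
    }

    int refreshes() const { return refreshes_; }

private:
    const SharedCatalog& catalog_;
    std::unordered_map<std::string, std::string> local_;
    std::uint64_t localVersion_ = 0;
    int refreshes_ = 0;
};
```

With N connections, one metadata change triggers N full copies in this
sketch.  A real design would presumably propagate deltas instead, which is
more machinery again.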

Also, keep in mind that if you use a library that makes coprocessor usage
transparent, then in the 99.99% of installations that don't have
coprocessors installed, you're adding major overhead with no possible gain.

There's an important lesson to be learned from the Sun 350 Workstation.
The 350 was designed to use a bit-blit chip from Carver Mead's Silicon
Compilers.  The chip worked as spec'd, but when they plugged it in, the
graphics went slower.  It turned out that although the chip was great for
large transfers, most operations were small, and the setup cost for the
operation was greater than the cost of doing the operation in software.
So they added a test to distinguish between small and large operations.
Then the graphics went even slower, as the more common short operations
became slower still with the addition of the test.
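
The arithmetic behind that story can be sketched in a few lines of C++.
This is a toy cost model, not a measurement of any real hardware: the
struct, the function names and every number you feed it are assumptions for
illustration only.

```cpp
#include <cstddef>

// Hypothetical cost model, in arbitrary time units.
struct CostModel {
    double setupCost;        // fixed cost to start one accelerated operation
    double accelPerUnit;     // cost per unit of work on the accelerator
    double softwarePerUnit;  // cost per unit of work in plain software
    double testCost;         // cost of the small-versus-large size test
};

// Cost of always doing the operation in software.
double softwareOnly(const CostModel& m, std::size_t units) {
    return m.softwarePerUnit * static_cast<double>(units);
}

// Cost of always using the accelerator: setup is paid on every operation.
double acceleratorOnly(const CostModel& m, std::size_t units) {
    return m.setupCost + m.accelPerUnit * static_cast<double>(units);
}

// Cost with the size test added: every operation pays testCost first.
double dispatched(const CostModel& m, std::size_t units, std::size_t threshold) {
    const double work = units >= threshold ? acceleratorOnly(m, units)
                                           : softwareOnly(m, units);
    return m.testCost + work;
}

// Size above which the accelerator wins.  Only meaningful when
// softwarePerUnit is greater than accelPerUnit.
double breakEvenUnits(const CostModel& m) {
    return m.setupCost / (m.softwarePerUnit - m.accelPerUnit);
}
```

With made-up values of setupCost = 100, accelPerUnit = 0.1,
softwarePerUnit = 1 and testCost = 2, the break-even point is about 111
units.  An 8-unit operation costs 8 in software but 10 once the test is
added, so if most operations are small, the test makes the common case
slower.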

I'm deeply cynical about hardware acceleration for database systems.
Dozens and dozens have been tried and nothing has succeeded.  There are at
least two or three dead companies that tried to accelerate MySQL.  I'm not
saying it can't be done, just that nobody has ever succeeded on relational
database type workloads.

Personally, I consider it folly.  It is so much better to use large
numbers of very cheap commodity servers than a single very expensive (and
rare) system.  NuoDB running on a bunch of cheap processors will blow the
doors off a single fast system with high end coprocessors for throughput.
The last numbers I heard were something like 1.5 million transactions per
second running on a hundred servers.  Your mileage may vary.

I think it would be vastly better for Firebird to address operating across
cheap commodity servers than to optimize for exotic -- and hyper-expensive
-- servers.

On Friday, March 20, 2015, Leyne, Sean <s...@broadviewsoftware.com> wrote:

> Jim,
>
> > The problem with specialized processors is that they are a scarce
> > resource that must be managed rather than shared. They're just dandy
> > when a server has a single specialized load, but on a server with
> > multiple clients, one guy gets the specialized processor and everyone
> > else waits.
> >
> > The best way for Firebird to do parallelism is to move towards fine
> > grain multi-threading so at least all available processors can keep
> > busy doing useful work.  Trying to accelerate a single operation with
> > non-shared hardware is almost always a net loss.
>
> To my way of thinking/understanding the current OpenCL/GPU/PHI solutions
> can be used as a shared resource.
>
> To be clear, I am proposing that the development focus should be on
> refactoring the existing code with the view of improving multi-threading
> support, first, and then parallelism -- thru libraries like the Intel C++
> Compiler or Intel Parallel Studio which can automatically detect and use
> available hardware (CPU/threads, MPI, GPU or coprocessor) to the greatest
> extent possible.
>
> Granted for some of the solutions, some resource management/queuing may be
> required, but the overall performance payoff could be worth it.
>
> In the case of the PHI, having up to 61* "helper" processors which could
> be responsible for performing sorting/grouping for *any* running query (so
> a shared resource) would provide significant benefit.  In the case of the
> PHI, each processor is a full x86 instance, so unlike GPU based solutions
> no specialized instructions set would be required.
>
> * 57 to 61 cores (@1.053 to 1.238GHz) per PHI card, a server could have
> several cards -- up to 8 cards per server.
>
>
> > You might remember that the original DEC JRD was the core for what was
> > eventually to be the DEC database machine.  We had people on specialized
> > hardware and specialized microcode.
>
> Granted I am now an old man with a 20yr old son, but I wasn't around in
> the Neolithic era!  ;-]
>
>
> > The Falcon group at MySQL had a long meeting with the Intel
> > parallelization tool group.
>
> May I ask, when were those discussions?
>
> Was the XEON PHI discussed?
>
>
> > At the end of the day, everyone was in agreement that with more
> > processes than cores, fine grain multi-threading using user mode
> > interlocked instructions for thread synchronization was by far the
> > best solution.
>
> Again, there are some operations (i.e. sorting/merging/grouping) where
> interlocked instructions are not necessary.
>
>
> Sean
>
>

-- 
Jim Starkey

------------------------------------------------------------------------------
Dive into the World of Parallel Programming The Go Parallel Website, sponsored
by Intel and developed in partnership with Slashdot Media, is your hub for all
things parallel software development, from weekly thought leadership blogs to
news, videos, case studies, tutorials and more. Take a look and join the 
conversation now. http://goparallel.sourceforge.net/

Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel
