Re: FP16 Support?

2018-01-08 Thread Kohei KaiGai
Just for your information.

I tried to implement "float2" according to IEEE 754 specification,
as a custom data type of PG-Strom.

https://github.com/heterodb/pg-strom/blob/master/src/float2.c
https://github.com/heterodb/pg-strom/blob/master/sql/float2.sql

Recent GPU devices (Maxwell or later) support the "half" precision
data type in hardware, so it might be valuable for somebody.
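
For anyone curious how little is involved: decoding a binary16 value into
a binary32 float only needs the sign bit, the 5-bit exponent and the 10-bit
mantissa, plus special cases for subnormals, infinities and NaN. A minimal
sketch (not the actual code in float2.c above) could look like this:

#include <stdint.h>
#include <string.h>

/* Decode an IEEE 754 binary16 value (as uint16_t) into a binary32 float. */
static float
half_to_float(uint16_t h)
{
    uint32_t sign     = (uint32_t) (h & 0x8000) << 16;
    uint32_t exponent = (h >> 10) & 0x1f;
    uint32_t mantissa = h & 0x03ff;
    uint32_t bits;
    float    result;

    if (exponent == 0x1f)
    {
        /* infinity or NaN: all-ones exponent in binary32 as well */
        bits = sign | 0x7f800000 | (mantissa << 13);
    }
    else if (exponent == 0)
    {
        if (mantissa == 0)
        {
            bits = sign;                /* signed zero */
        }
        else
        {
            /* a binary16 subnormal becomes a normal number in binary32 */
            exponent = 127 - 15 + 1;
            while ((mantissa & 0x0400) == 0)
            {
                mantissa <<= 1;
                exponent--;
            }
            mantissa &= 0x03ff;
            bits = sign | (exponent << 23) | (mantissa << 13);
        }
    }
    else
    {
        /* normal number: re-bias the exponent from 15 to 127 */
        bits = sign | ((exponent - 15 + 127) << 23) | (mantissa << 13);
    }

    memcpy(&result, &bits, sizeof(result));
    return result;
}

The encode direction is slightly more work, since the 23-bit mantissa has
to be rounded to 10 bits and out-of-range values clamped to infinity.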

Thanks,

2017-11-14 14:49 GMT+09:00 Kohei KaiGai :
> 2017-11-14 10:33 GMT+09:00 Thomas Munro :
>> On Tue, Nov 14, 2017 at 1:11 PM, Kohei KaiGai  wrote:
>>> Any opinions?
>>
>> The only reason I can think of for having it in core is that you might
>> want to use standard SQL notation FLOAT(10) to refer to it.  Right now
>> our parser converts that to float4 but it could map precisions up to
>> 10 to float2.  The need for such special treatment is one of my
>> arguments for considering SQL:2016 DECFLOAT(n) in core PostgreSQL.
>> But this case is different: FLOAT(10) already works, it just maps to a
>> type with a larger significand, as permitted by the standard.  So why
>> not just do these short floats as an extension type?
>>
> Our extension can indeed provide its own "half" or "float2" data type
> using CREATE TYPE. I thought it would be useful to other people as well,
> even if they are not interested in in-database analytics on GPUs, to
> reduce the amount of storage consumed.
>
> Of course, that is just my opinion.
>
> Thanks,
> --
> HeteroDB, Inc / The PG-Strom Project
> KaiGai Kohei 



-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei 



Re: FP16 Support?

2017-11-13 Thread Thomas Munro
On Tue, Nov 14, 2017 at 6:49 PM, Kohei KaiGai  wrote:
> 2017-11-14 10:33 GMT+09:00 Thomas Munro :
>> On Tue, Nov 14, 2017 at 1:11 PM, Kohei KaiGai  wrote:
>>> Any opinions?
>>
>> The only reason I can think of for having it in core is that you might
>> want to use standard SQL notation FLOAT(10) to refer to it.  Right now
>> our parser converts that to float4 but it could map precisions up to
>> 10 to float2.  The need for such special treatment is one of my
>> arguments for considering SQL:2016 DECFLOAT(n) in core PostgreSQL.
>> But this case is different: FLOAT(10) already works, it just maps to a
>> type with a larger significand, as permitted by the standard.  So why
>> not just do these short floats as an extension type?
>>
> Our extension can indeed provide its own "half" or "float2" data type
> using CREATE TYPE. I thought it would be useful to other people as well,
> even if they are not interested in in-database analytics on GPUs, to
> reduce the amount of storage consumed.
>
> Of course, that is just my opinion.

Perhaps what the world needs is a single extension called ieee754 that
would provide binary16 AND decimal32, decimal64.  Seems a bit unlikely
to happen though, because even though that's a single standard the
people who care about binary floats and the people who care about
decimal floats go to different conferences and the other base isn't on
their radar (joking, sort of).  I also wonder if there could be some
way to make it so that the FLOAT(n) and DECFLOAT(n) typename
conversion happens somewhere outside the parser so that extensions can
participate in that kind of trick...
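
Purely as a thought experiment (no such hook exists in PostgreSQL today),
it could follow the usual *_hook convention, so an extension claims the
precisions it wants and returns NULL to fall back to the built-in mapping:

#include "postgres.h"
#include "nodes/makefuncs.h"
#include "nodes/parsenodes.h"

/* Hypothetical hook type -- NOT an existing PostgreSQL API. */
typedef TypeName *(*float_typename_hook_type) (int precision, bool is_decimal);
extern PGDLLIMPORT float_typename_hook_type float_typename_hook;

/* What an extension might install from its _PG_init() */
static TypeName *
my_float_typename(int precision, bool is_decimal)
{
    if (!is_decimal && precision <= 10)
        return makeTypeName("float2");  /* claim small binary floats */
    return NULL;                        /* use the built-in mapping */
}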

-- 
Thomas Munro
http://www.enterprisedb.com



Re: FP16 Support?

2017-11-13 Thread Kohei KaiGai
2017-11-14 10:33 GMT+09:00 Thomas Munro :
> On Tue, Nov 14, 2017 at 1:11 PM, Kohei KaiGai  wrote:
>> Any opinions?
>
> The only reason I can think of for having it in core is that you might
> want to use standard SQL notation FLOAT(10) to refer to it.  Right now
> our parser converts that to float4 but it could map precisions up to
> 10 to float2.  The need for such special treatment is one of my
> arguments for considering SQL:2016 DECFLOAT(n) in core PostgreSQL.
> But this case is different: FLOAT(10) already works, it just maps to a
> type with a larger significand, as permitted by the standard.  So why
> not just do these short floats as an extension type?
>
Our extension can indeed provide its own "half" or "float2" data type
using CREATE TYPE. I thought it would be useful to other people as well,
even if they are not interested in in-database analytics on GPUs, to
reduce the amount of storage consumed.
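
For reference, here is a minimal sketch of the C input/output functions
such a CREATE TYPE definition needs; the names (float2_in, float2_out,
float_to_half, half_to_float) and the lack of error handling are
illustrative only, not the actual PG-Strom code:

#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

/* illustrative helpers; a real implementation converts per IEEE 754 binary16 */
extern uint16 float_to_half(float val);
extern float  half_to_float(uint16 half);

PG_FUNCTION_INFO_V1(float2_in);
Datum
float2_in(PG_FUNCTION_ARGS)
{
    char   *str = PG_GETARG_CSTRING(0);
    float   val = strtof(str, NULL);    /* error checks omitted for brevity */

    PG_RETURN_DATUM(UInt16GetDatum(float_to_half(val)));
}

PG_FUNCTION_INFO_V1(float2_out);
Datum
float2_out(PG_FUNCTION_ARGS)
{
    uint16  half = DatumGetUInt16(PG_GETARG_DATUM(0));
    char   *result = (char *) palloc(32);

    snprintf(result, 32, "%g", (double) half_to_float(half));
    PG_RETURN_CSTRING(result);
}

The SQL side would then declare the type with INTERNALLENGTH = 2 and
PASSEDBYVALUE, pointing at these I/O functions.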

Of course, that is just my opinion.

Thanks,
-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei 



Re: FP16 Support?

2017-11-13 Thread Kohei KaiGai
2017-11-14 10:21 GMT+09:00 Tom Lane :
> Kohei KaiGai  writes:
>> How about your thought for support of half-precision floating point,
>> FP16 in short?
>
> This sounds like a whole lotta work for little if any gain.  There's not
> going to be any useful performance gain from using half-width floats
> except in an environment where it's the individual FLOPs that dominate
> your costs.  PG is not designed for that sort of high-throughput
> number-crunching, and it's not likely to get there anytime soon.
>
> When we can show real workloads where float32 ops are actually the
> dominant time sink, it would be appropriate to think about whether
> float16 is a useful solution.  I don't deny that we could get there
> someday, but I think putting in float16 now would be a fine example
> of having your priorities reversed.
>
A typical workload I expect is a data scientist storing more than a million
records, each containing a feature vector with hundreds or thousands of
dimensions. That consumes storage space in proportion to the width and
number of items, and it also affects scan performance. In addition, it may
need an extra data format conversion if the user's script wants to process
the feature vectors as half-precision floating point.

If each record contains a 1000-dimension feature vector in FP32,
a million records consume 4GB of storage space.
When FP16 is sufficient, they consume only half that space, and thus
it takes half the time to export the data arrays from the database.

Of course, our own extension can define its own "half" or "float2" data
type regardless of the core feature; however, I thought it would be a
useful feature for other people as well.

Thanks,
-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei 



Re: FP16 Support?

2017-11-13 Thread Thomas Munro
On Tue, Nov 14, 2017 at 1:11 PM, Kohei KaiGai  wrote:
> Any opinions?

The only reason I can think of for having it in core is that you might
want to use standard SQL notation FLOAT(10) to refer to it.  Right now
our parser converts that to float4 but it could map precisions up to
10 to float2.  The need for such special treatment is one of my
arguments for considering SQL:2016 DECFLOAT(n) in core PostgreSQL.
But this case is different: FLOAT(10) already works, it just maps to a
type with a larger significand, as permitted by the standard.  So why
not just do these short floats as an extension type?
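
To make that concrete, the mapping could look like the sketch below; the
float2 branch is hypothetical, while the float4/float8 split is what the
grammar effectively does today:

/*
 * Sketch only: map the p in FLOAT(p) to a type name.  The float2 branch
 * is hypothetical; FLOAT(1..24) -> float4 and FLOAT(25..53) -> float8 is
 * the current behavior.
 */
static const char *
float_type_for_precision(int precision)
{
    if (precision < 1 || precision > 53)
        return NULL;        /* the real parser raises an error here */
    if (precision <= 10)
        return "float2";    /* hypothetical: binary16's 11-bit significand covers this */
    if (precision <= 24)
        return "float4";
    return "float8";
}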

-- 
Thomas Munro
http://www.enterprisedb.com



Re: FP16 Support?

2017-11-13 Thread Andres Freund
On 2017-11-13 20:21:47 -0500, Tom Lane wrote:
> Kohei KaiGai  writes:
> > How about your thought for support of half-precision floating point,
> > FP16 in short?
> 
> This sounds like a whole lotta work for little if any gain.  There's not
> going to be any useful performance gain from using half-width floats
> except in an environment where it's the individual FLOPs that dominate
> your costs.  PG is not designed for that sort of high-throughput
> number-crunching, and it's not likely to get there anytime soon.
> 
> When we can show real workloads where float32 ops are actually the
> dominant time sink, it would be appropriate to think about whether
> float16 is a useful solution.  I don't deny that we could get there
> someday, but I think putting in float16 now would be a fine example
> of having your priorities reversed.

Agree that there's no performance argument. I think you could kinda
sorta make an argument for higher storage density in cases where a lot
of floats are stored in the database.  I'd personally still consider
that not worthwhile to invest time in, but ...

Greetings,

Andres Freund



Re: FP16 Support?

2017-11-13 Thread Tom Lane
Kohei KaiGai  writes:
> How about your thought for support of half-precision floating point,
> FP16 in short?

This sounds like a whole lotta work for little if any gain.  There's not
going to be any useful performance gain from using half-width floats
except in an environment where it's the individual FLOPs that dominate
your costs.  PG is not designed for that sort of high-throughput
number-crunching, and it's not likely to get there anytime soon.

When we can show real workloads where float32 ops are actually the
dominant time sink, it would be appropriate to think about whether
float16 is a useful solution.  I don't deny that we could get there
someday, but I think putting in float16 now would be a fine example
of having your priorities reversed.

regards, tom lane