Re: FP16 Support?
Just for your information: I tried to implement "float2" according to the
IEEE 754 specification, as a custom data type of PG-Strom.

https://github.com/heterodb/pg-strom/blob/master/src/float2.c
https://github.com/heterodb/pg-strom/blob/master/sql/float2.sql

Recent GPU devices (Maxwell or later) support the "half" precision data
type in hardware, so it might be valuable for somebody.

Thanks,

2017-11-14 14:49 GMT+09:00 Kohei KaiGai:
> 2017-11-14 10:33 GMT+09:00 Thomas Munro:
>> On Tue, Nov 14, 2017 at 1:11 PM, Kohei KaiGai wrote:
>>> Any opinions?
>>
>> The only reason I can think of for having it in core is that you might
>> want to use standard SQL notation FLOAT(10) to refer to it. Right now
>> our parser converts that to float4 but it could map precisions up to
>> 10 to float2. The need for such special treatment is one of my
>> arguments for considering SQL:2016 DECFLOAT(n) in core PostgreSQL.
>> But this case is different: FLOAT(10) already works, it just maps to a
>> type with a larger significand, as permitted by the standard. So why
>> not just do these short floats as an extension type?
>>
> Our extension will be able to provide its own "half" or "float2" data type
> using CREATE TYPE, indeed. I thought it would be useful to other people,
> even if they are not interested in in-database analytics with GPU, to
> reduce the amount of storage consumption.
>
> Of course, it is my opinion.
>
> Thanks,
> --
> HeteroDB, Inc / The PG-Strom Project
> KaiGai Kohei

--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei
Re: FP16 Support?
On Tue, Nov 14, 2017 at 6:49 PM, Kohei KaiGai wrote:
> 2017-11-14 10:33 GMT+09:00 Thomas Munro:
>> On Tue, Nov 14, 2017 at 1:11 PM, Kohei KaiGai wrote:
>>> Any opinions?
>>
>> The only reason I can think of for having it in core is that you might
>> want to use standard SQL notation FLOAT(10) to refer to it. Right now
>> our parser converts that to float4 but it could map precisions up to
>> 10 to float2. The need for such special treatment is one of my
>> arguments for considering SQL:2016 DECFLOAT(n) in core PostgreSQL.
>> But this case is different: FLOAT(10) already works, it just maps to a
>> type with a larger significand, as permitted by the standard. So why
>> not just do these short floats as an extension type?
>>
> Our extension will be able to provide its own "half" or "float2" data type
> using CREATE TYPE, indeed. I thought it would be useful to other people,
> even if they are not interested in in-database analytics with GPU, to
> reduce the amount of storage consumption.
>
> Of course, it is my opinion.

Perhaps what the world needs is a single extension called ieee754 that
would provide binary16, decimal32, and decimal64. That seems a bit
unlikely to happen, though, because even though it's a single standard,
the people who care about binary floats and the people who care about
decimal floats go to different conferences, and the other base isn't on
their radar (joking, sort of).

I also wonder if there could be some way to make the FLOAT(n) and
DECFLOAT(n) type-name conversion happen somewhere outside the parser, so
that extensions could participate in that kind of trick...

--
Thomas Munro
http://www.enterprisedb.com
Re: FP16 Support?
2017-11-14 10:33 GMT+09:00 Thomas Munro:
> On Tue, Nov 14, 2017 at 1:11 PM, Kohei KaiGai wrote:
>> Any opinions?
>
> The only reason I can think of for having it in core is that you might
> want to use standard SQL notation FLOAT(10) to refer to it. Right now
> our parser converts that to float4 but it could map precisions up to
> 10 to float2. The need for such special treatment is one of my
> arguments for considering SQL:2016 DECFLOAT(n) in core PostgreSQL.
> But this case is different: FLOAT(10) already works, it just maps to a
> type with a larger significand, as permitted by the standard. So why
> not just do these short floats as an extension type?

Our extension will be able to provide its own "half" or "float2" data
type using CREATE TYPE, indeed. I thought it would be useful to other
people, even if they are not interested in in-database analytics with
GPU, to reduce the amount of storage consumption.

Of course, it is my opinion.

Thanks,
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei
Re: FP16 Support?
2017-11-14 10:21 GMT+09:00 Tom Lane:
> Kohei KaiGai writes:
>> How about your thought for support of half-precision floating point,
>> FP16 in short?
>
> This sounds like a whole lotta work for little if any gain. There's not
> going to be any useful performance gain from using half-width floats
> except in an environment where it's the individual FLOPs that dominate
> your costs. PG is not designed for that sort of high-throughput
> number-crunching, and it's not likely to get there anytime soon.
>
> When we can show real workloads where float32 ops are actually the
> dominant time sink, it would be appropriate to think about whether
> float16 is a useful solution. I don't deny that we could get there
> someday, but I think putting in float16 now would be a fine example
> of having your priorities reversed.

A typical workload I expect is a data scientist storing more than a
million records, each containing a feature vector of hundreds or
thousands of dimensions. The storage space consumed depends on the width
and number of items, and it also affects scan performance. In addition,
extra data format conversion may be needed if the user's script wants to
process the feature vector as half-precision floating point.

If a record contains a feature vector of 1000 dimensions in FP32, a
million records consume 4GB of storage space. In cases where FP16 is
sufficient, it consumes only half that space, and thus takes half the
time to export the data arrays from the database.

Of course, our own extension can define its own "half" or "float2" data
types regardless of the core feature; however, I thought it would be a
useful feature for other people as well.

Thanks,
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei
Re: FP16 Support?
On Tue, Nov 14, 2017 at 1:11 PM, Kohei KaiGai wrote:
> Any opinions?

The only reason I can think of for having it in core is that you might
want to use the standard SQL notation FLOAT(10) to refer to it. Right
now our parser converts that to float4, but it could map precisions up
to 10 to float2. The need for such special treatment is one of my
arguments for considering SQL:2016 DECFLOAT(n) in core PostgreSQL. But
this case is different: FLOAT(10) already works; it just maps to a type
with a larger significand, as permitted by the standard. So why not just
do these short floats as an extension type?

--
Thomas Munro
http://www.enterprisedb.com
Re: FP16 Support?
On 2017-11-13 20:21:47 -0500, Tom Lane wrote:
> Kohei KaiGai writes:
>> How about your thought for support of half-precision floating point,
>> FP16 in short?
>
> This sounds like a whole lotta work for little if any gain. There's not
> going to be any useful performance gain from using half-width floats
> except in an environment where it's the individual FLOPs that dominate
> your costs. PG is not designed for that sort of high-throughput
> number-crunching, and it's not likely to get there anytime soon.
>
> When we can show real workloads where float32 ops are actually the
> dominant time sink, it would be appropriate to think about whether
> float16 is a useful solution. I don't deny that we could get there
> someday, but I think putting in float16 now would be a fine example
> of having your priorities reversed.

Agreed that there's no performance argument. I think you could kinda
sorta make an argument for higher storage density in cases where a lot
of floats are stored in the database. I'd personally still consider that
not worthwhile to invest time in, but ...

Greetings,

Andres Freund
Re: FP16 Support?
Kohei KaiGai writes:
> How about your thought for support of half-precision floating point,
> FP16 in short?

This sounds like a whole lotta work for little if any gain. There's not
going to be any useful performance gain from using half-width floats
except in an environment where it's the individual FLOPs that dominate
your costs. PG is not designed for that sort of high-throughput
number-crunching, and it's not likely to get there anytime soon.

When we can show real workloads where float32 ops are actually the
dominant time sink, it would be appropriate to think about whether
float16 is a useful solution. I don't deny that we could get there
someday, but I think putting in float16 now would be a fine example
of having your priorities reversed.

regards, tom lane