This patch adds the initial support for the 16-bit floating point formats.
_Float16 is the IEEE 754 half precision format.  __bfloat16 is the Google Brain
16-bit format.
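
As a rough illustration of how the two layouts differ (a sketch only, not code
from this patch): IEEE 754 binary16 has 1 sign, 5 exponent and 10 fraction
bits, while bfloat16 has 1 sign, 8 exponent and 7 fraction bits.  The helper
split_fp16 and struct fp16_fields below are hypothetical names used only for
this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the three fields of a 16-bit floating point
   value, taken from its raw bit pattern.  */
struct fp16_fields
{
  unsigned sign;
  unsigned exponent;
  unsigned fraction;
};

/* Split BITS into sign/exponent/fraction.  EXP_BITS is 5 for IEEE 754
   binary16 (_Float16) and 8 for bfloat16; the fraction takes the rest.  */
static struct fp16_fields
split_fp16 (uint16_t bits, unsigned exp_bits)
{
  assert (exp_bits >= 1 && exp_bits <= 14);

  unsigned frac_bits = 15 - exp_bits;
  struct fp16_fields f;

  f.sign = bits >> 15;
  f.exponent = (bits >> frac_bits) & ((1u << exp_bits) - 1);
  f.fraction = bits & ((1u << frac_bits) - 1);
  return f;
}
```

With these layouts, 1.0 is 0x3C00 in binary16 and 0x3F80 in bfloat16, and -0.0
is 0x8000 in both formats.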

To use either _Float16 or __bfloat16, the user must pass the -mfloat16 option
to enable the support.
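
A minimal sketch of how user code might guard its use of the new types,
assuming the __FLOAT16__ and __BFLOAT16__ macros that this patch defines under
-mfloat16.  The typedef names half_t and brain_t and the function copy_fp16
are hypothetical; the fallback to float is only there so the example also
compiles where the option is not available.  The function only moves values,
since this patch on its own does not provide the libgcc conversion support.

```c
#include <stddef.h>

/* Use the 16-bit types when -mfloat16 is in effect; otherwise fall back to
   float so that the example still builds (assumption for illustration).  */
#if defined (__FLOAT16__)
typedef _Float16 half_t;
#else
typedef float half_t;
#endif

#if defined (__BFLOAT16__)
typedef __bfloat16 brain_t;
#else
typedef float brain_t;
#endif

/* Copy N values from SRC to DST.  Only loads and stores are needed, no
   arithmetic or conversions.  */
static void
copy_fp16 (half_t *dst, const half_t *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i];
}
```

On a compiler with this patch applied, the file would be built with something
like "gcc -mcpu=power9 -mfloat16 -c example.c"; the file name is a placeholder.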

In this patch only the machine independent support is used.  In order to be
usable, the next patch will also need to be installed.  That patch will add
libgcc support for 16-bit floating point.

All 11 patches have been tested on little endian and big endian PowerPC
servers with no regressions.  Can I check in these patches?

2025-11-14  Michael Meissner  <[email protected]>

gcc/

        * config/rs6000/constraints.md (eZ): New constraint for -0.0.
        * config/rs6000/float16.md: New file to add basic 16-bit floating point
        support.
        * config/rs6000/predicates.md (easy_fp_constant): Add support for HFmode
        and BFmode constants.
        (easy_vector_constant): Add support for V8HFmode and V8BFmode to load up
        the vector -0.0 constant.
        (minus_zero_constant): New predicate.
        (fp16_xxspltiw_constant): Likewise.
        * config/rs6000/rs6000-builtin.cc (rs6000_type_string): Add support for
        16-bit floating point types.
        (rs6000_init_builtins): Create the bfloat16_type_node if needed.
        * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define
        __FLOAT16__ and __BFLOAT16__ if 16-bit floating point is enabled.
        * config/rs6000/rs6000-call.cc (init_cumulative_args): Warn if a
        function returns a 16-bit floating point value unless -Wno-psabi is
        used.
        (rs6000_function_arg): Warn if a 16-bit floating point value is passed
        to a function unless -Wno-psabi is used.
        * config/rs6000/rs6000-protos.h (vec_const_128bit_type): Add mode field
        to detect initializing 16-bit floating constants.
        * config/rs6000/rs6000.cc (rs6000_hard_regno_mode_ok_uncached): Add
        support for 16-bit floating point.
        (rs6000_modes_tieable_p): Don't allow 16-bit floating point modes to tie
        with other modes.
        (rs6000_debug_reg_global): Add BFmode and HFmode.
        (rs6000_setup_reg_addr_masks): Add support for 16-bit floating point
        types.
        (rs6000_init_hard_regno_mode_ok): Likewise.
        (rs6000_option_override_internal): Add a check whether -mfloat16 can be
        used.
        (easy_altivec_constant): Add support for 16-bit floating point.
        (xxspltib_constant_p): Likewise.
        (rs6000_expand_vector_init): Likewise.
        (rs6000_expand_vector_set): Likewise.
        (rs6000_expand_vector_extract): Likewise.
        (rs6000_split_vec_extract_var): Likewise.
        (reg_offset_addressing_ok_p): Likewise.
        (rs6000_legitimate_offset_address_p): Likewise.
        (legitimate_lo_sum_address_p): Likewise.
        (rs6000_secondary_reload_simple_move): Likewise.
        (rs6000_preferred_reload_class): Likewise.
        (rs6000_can_change_mode_class): Likewise.
        (rs6000_output_move_128bit): Likewise.
        (rs6000_load_constant_and_splat): Likewise.
        (rs6000_scalar_mode_supported_p): Likewise.
        (rs6000_libgcc_floating_mode_supported_p): Return true for HFmode and
        BFmode if -mfloat16.
        (rs6000_floatn_mode): Enable _Float16 if -mfloat16.
        (rs6000_opt_masks): Add -mfloat16.
        (constant_fp_to_128bit_vector): Add support for 16-bit floating point.
        (vec_const_128bit_to_bytes): Likewise.
        (constant_generates_xxspltiw): Likewise.
        * config/rs6000/rs6000.h (FP16_SCALAR_MODE_P): New macro.
        (FP16_VECTOR_MODE_P): Likewise.
        (TARGET_BFLOAT16_HW): New macro.
        (TARGET_FLOAT16_HW): Likewise.
        (TARGET_BFLOAT16_HW_VECTOR): Likewise.
        (TARGET_FLOAT16_HW_VECTOR): Likewise.
        * config/rs6000/rs6000.md (wd): Add BFmode and HFmode.
        (toplevel): Include float16.md.
        * config/rs6000/rs6000.opt (-mfloat16): New option.
        * doc/invoke.texi (RS/6000 and PowerPC Options): Document -mfloat16.
---
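Note on the -0.0 constant: the split in float16.md builds it by splatting -1
into every halfword and then shifting each halfword left by itself.  The
following is a scalar C emulation of that idea, not code from the patch; it
assumes the halfword shift uses only the low 4 bits of each shift count, and
the names FP16_LANES and splat_minus_zero are hypothetical.

```c
#include <stdint.h>

enum { FP16_LANES = 8 };	/* Eight 16-bit elements in a 128-bit vector.  */

/* Fill V with the bit pattern of -0.0 (0x8000) in every lane.  */
static void
splat_minus_zero (uint16_t v[FP16_LANES])
{
  /* Step 1: splat -1, i.e. 0xFFFF, into every halfword.  */
  for (int i = 0; i < FP16_LANES; i++)
    v[i] = 0xFFFF;

  /* Step 2: shift each halfword left by its own low 4 bits (15 here),
     leaving only the sign bit set.  */
  for (int i = 0; i < FP16_LANES; i++)
    v[i] = (uint16_t) (v[i] << (v[i] & 15));
}
```

Since 0x8000 is -0.0 in both _Float16 and __bfloat16, the same sequence serves
V8HFmode and V8BFmode.
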
 gcc/config/rs6000/constraints.md    |   5 +
 gcc/config/rs6000/float16.md        | 159 ++++++++++++++++++
 gcc/config/rs6000/predicates.md     |  78 +++++++++
 gcc/config/rs6000/rs6000-builtin.cc |  20 +++
 gcc/config/rs6000/rs6000-c.cc       |   6 +
 gcc/config/rs6000/rs6000-call.cc    |  20 +++
 gcc/config/rs6000/rs6000-protos.h   |   1 +
 gcc/config/rs6000/rs6000.cc         | 241 +++++++++++++++++++++++++---
 gcc/config/rs6000/rs6000.h          |  20 +++
 gcc/config/rs6000/rs6000.md         |   3 +
 gcc/config/rs6000/rs6000.opt        |   4 +
 gcc/doc/invoke.texi                 |  23 +++
 12 files changed, 554 insertions(+), 26 deletions(-)
 create mode 100644 gcc/config/rs6000/float16.md

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 3da9ed08681..71b49290613 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -222,6 +222,11 @@ (define_constraint "eQ"
   "An IEEE 128-bit constant that can be loaded into VSX registers."
   (match_operand 0 "easy_vector_constant_ieee128"))
 
+;; A negative 0 constant
+(define_constraint "eZ"
+  "A floating point -0.0 constant."
+  (match_operand 0 "minus_zero_constant"))
+
 ;; Floating-point constraints.  These two are defined so that insn
 ;; length attributes can be calculated exactly.
 
diff --git a/gcc/config/rs6000/float16.md b/gcc/config/rs6000/float16.md
new file mode 100644
index 00000000000..1e6339754b2
--- /dev/null
+++ b/gcc/config/rs6000/float16.md
@@ -0,0 +1,159 @@
+;; Machine description for IBM RISC System 6000 (POWER) for GNU C compiler
+;; Copyright (C) 1990-2025 Free Software Foundation, Inc.
+;; Contributed by Richard Kenner ([email protected])
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+;; Support for _Float16 (HFmode) and __bfloat16 (BFmode)
+
+;; Mode iterator for 16-bit floating point modes both as a scalar and
+;; as a vector.
+(define_mode_iterator FP16     [BF HF])
+(define_mode_iterator VS_FP16  [BF HF V8BF V8HF])
+(define_mode_iterator VFP16    [V8BF V8HF])
+
+;; Mode attribute giving the vector mode for a 16-bit floating point
+;; scalar in both upper and lower case.
+(define_mode_attr FP16_VECTOR8 [(BF "V8BF")
+                               (HF "V8HF")])
+
+(define_mode_attr fp16_vector8 [(BF "v8bf")
+                               (HF "v8hf")])
+
+;; _Float16 and __bfloat16 moves
+(define_expand "mov<mode>"
+  [(set (match_operand:FP16 0 "nonimmediate_operand")
+       (match_operand:FP16 1 "any_operand"))]
+  "TARGET_FLOAT16"
+{
+  if (MEM_P (operands[0]) && !REG_P (operands[1]))
+    operands[1] = force_reg (<MODE>mode, operands[1]);
+})
+
+;; On power10, we can load up HFmode and BFmode constants with xxspltiw
+;; or pli.
+(define_insn "*mov<mode>_xxspltiw"
+  [(set (match_operand:FP16 0 "gpc_reg_operand" "=wa,wa,?r,?r")
+       (match_operand:FP16 1 "fp16_xxspltiw_constant" "j,eP,j,eP"))]
+  "TARGET_FLOAT16
+   && (TARGET_PREFIXED || operands[1] == CONST0_RTX (<MODE>mode))"
+{
+  rtx op1 = operands[1];
+  const REAL_VALUE_TYPE *rtype = CONST_DOUBLE_REAL_VALUE (op1);
+  long real_words[1];
+
+  if (op1 == CONST0_RTX (<MODE>mode))
+    return (!vsx_register_operand (operands[0], <MODE>mode)
+           ? "li %0,0"
+           : "xxlxor %x0,%x0,%x0");
+
+  real_to_target (real_words, rtype, <MODE>mode);
+  operands[2] = GEN_INT (real_words[0]);
+  return (vsx_register_operand (operands[0], <MODE>mode)
+         ? "xxspltiw %x0,%2"
+         : "pli %0,%2");
+}
+  [(set_attr "type"     "veclogical, vecsimple, *,  *")
+   (set_attr "prefixed" "no,         yes,       no, yes")])
+
+;; Handle creating -0.0 if we don't have XXSPLTIW.  For the scalar
+;; modes, we can't do the gen_lowpart call until after register
+;; allocation.
+(define_split
+  [(set (match_operand:VS_FP16 0 "altivec_register_operand")
+       (match_operand:VS_FP16 1 "minus_zero_constant"))]
+  "TARGET_FLOAT16 && reload_completed"
+  [(const_int 0)]
+{
+  int dest_r = reg_or_subregno (operands[0]);
+  rtx dest = gen_rtx_REG (V8HImode, dest_r);
+  size_t nunits = GET_MODE_NUNITS (V8HFmode);
+
+  rtvec v = rtvec_alloc (nunits);
+  for (size_t i = 0; i < nunits; i++)
+    RTVEC_ELT (v, i) = constm1_rtx;
+
+  rs6000_expand_vector_init (dest, gen_rtx_PARALLEL (V8HImode, v));
+  emit_insn (gen_rtx_SET (dest, gen_rtx_ASHIFT (V8HImode, dest, dest)));
+  DONE;
+})
+
+
+(define_insn "*mov<mode>_internal"
+  [(set (match_operand:FP16 0 "nonimmediate_operand"
+                      "=wa,       wa,       Z,         r,          r,
+                        m,        r,        wa,        wa,         r,
+                        v")
+
+       (match_operand:FP16 1 "any_operand"
+                      "wa,        Z,        wa,        r,          m,
+                       r,         wa,       r,         j,          j,
+                       eZ"))]
+  "TARGET_FLOAT16
+   && (gpc_reg_operand (operands[0], <MODE>mode)
+       || gpc_reg_operand (operands[1], <MODE>mode))"
+  "@
+   xxlor %x0,%x1,%x1
+   lxsihzx %x0,%y1
+   stxsihx %x1,%y0
+   mr %0,%1
+   lhz%U1%X1 %0,%1
+   sth%U0%X0 %1,%0
+   mfvsrwz %0,%x1
+   mtvsrwz %x0,%1
+   xxlxor %x0,%x0,%x0
+   li %0,0
+   #"
+  [(set_attr "type"   "vecsimple, fpload,    fpstore,   *,          load,
+                       store,     mfvsr,     mtvsr,     veclogical, *,
+                       vecperm")
+   (set_attr "isa"    "*,         p9v,       p9v,       *,          *,
+                       *,         p8v,       p8v,       p9v,        *,
+                       *")
+   (set_attr "length" "*,         *,         *,         *,          *,
+                       *,         *,         *,         *,          *,
+                       8")])
+
+;; Vector duplicate
+(define_insn "*vecdup<mode>_reg"
+  [(set (match_operand:<FP16_VECTOR8> 0 "altivec_register_operand" "=v")
+       (vec_duplicate:<FP16_VECTOR8>
+        (match_operand:FP16 1 "altivec_register_operand" "v")))]
+  "TARGET_FLOAT16"
+  "vsplth %0,%1,3"
+  [(set_attr "type" "vecperm")])
+
+(define_insn "*vecdup<mode>_const"
+  [(set (match_operand:<FP16_VECTOR8> 0 "vsx_register_operand" "=wa,wa")
+       (vec_duplicate:<FP16_VECTOR8>
+        (match_operand:FP16 1 "fp16_xxspltiw_constant" "j,eP")))]
+  "TARGET_FLOAT16
+   && (TARGET_PREFIXED || operands[1] == CONST0_RTX (<MODE>mode))"
+{
+  rtx op1 = operands[1];
+  if (op1 == CONST0_RTX (<MODE>mode))
+    return "xxlxor %x0,%x0,%x0";
+
+  const REAL_VALUE_TYPE *rtype = CONST_DOUBLE_REAL_VALUE (op1);
+  long real_words[1];
+
+  real_to_target (real_words, rtype, <MODE>mode);
+  operands[2] = GEN_INT (real_words[0]);
+  return "xxspltiw %x0,%2";
+}
+  [(set_attr "type" "veclogical,vecperm")
+   (set_attr "prefixed" "*,yes")])
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index f1e03ec30c9..785d09b9423 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -633,6 +633,11 @@ (define_predicate "easy_fp_constant"
   if (TARGET_VSX && op == CONST0_RTX (mode))
     return 1;
 
+  /* If we are on power10, we can use XXSPLTIW to load constants.  On power8
+     and power9, we can use direct move.  */
+  if (FP16_SCALAR_MODE_P (mode))
+    return true;
+
   /* Constants that can be generated with ISA 3.1 instructions are easy.  */
   vec_const_128bit_type vsx_const;
   if (TARGET_POWER10 && vec_const_128bit_to_bytes (op, mode, &vsx_const))
@@ -776,6 +781,10 @@ (define_predicate "easy_vector_constant"
            return true;
        }
 
+      /* -0.0 can be done as VSPLTISH x,-1 and VSLH x,x,x.  */
+      if (FP16_VECTOR_MODE_P (mode) && minus_zero_constant (op, mode))
+       return true;
+
       if (TARGET_P9_VECTOR
           && xxspltib_constant_p (op, mode, &num_insns, &value))
        return true;
@@ -2198,3 +2207,72 @@ (define_predicate "lowpart_subreg_operator"
   (and (match_code "subreg")
        (match_test "subreg_lowpart_offset (mode, GET_MODE (SUBREG_REG (op)))
                    == SUBREG_BYTE (op)")))
+
+;; Return 1 if this is a floating point scalar constant that is -0.0 or
+;; a vector floating point constant where each element is -0.0.
+(define_predicate "minus_zero_constant"
+  (match_code "const_double,vec_duplicate,const_vector")
+{
+  if (GET_CODE (op) == VEC_DUPLICATE)
+    {
+      op = XEXP (op, 0);
+      if (!CONST_DOUBLE_P (op))
+       return false;
+
+      mode = GET_MODE (op);
+    }
+
+  /* Scalar or vector filled with duplicates.  */
+  if (CONST_DOUBLE_P (op))
+    {
+      if (!SCALAR_FLOAT_MODE_P (mode))
+       return false;
+
+      const REAL_VALUE_TYPE *rtype = CONST_DOUBLE_REAL_VALUE (op);
+      return real_isnegzero (rtype);
+    }
+
+  /* Vector constant, check all elements.  */
+  else if (CONST_VECTOR_P (op))
+    {
+      if (GET_MODE_CLASS (mode) != MODE_VECTOR_FLOAT)
+       return false;
+
+      size_t nunits = GET_MODE_NUNITS (mode);
+      for (size_t i = 0; i < nunits; i++)
+       {
+         rtx ele = CONST_VECTOR_ELT (op, i);
+         if (!CONST_DOUBLE_P (ele))
+           return false;
+
+         const REAL_VALUE_TYPE *rtype = CONST_DOUBLE_REAL_VALUE (ele);
+         if (!real_isnegzero (rtype))
+           return false;
+       }
+
+      return true;
+    }
+
+  return false;
+})
+
+;; Return 1 if this is a 16-bit floating point constant that can be
+;; loaded with XXSPLTIW or is 0.0 that can be loaded with XXSPLTIB.
+(define_predicate "fp16_xxspltiw_constant"
+  (match_code "const_double")
+{
+  if (!FP16_SCALAR_MODE_P (mode))
+    return false;
+
+  if (op == CONST0_RTX (mode))
+    return true;
+
+  if (!TARGET_PREFIXED)
+    return false;
+
+  vec_const_128bit_type vsx_const;
+  if (!vec_const_128bit_to_bytes (op, mode, &vsx_const))
+    return false;
+
+  return constant_generates_xxspltiw (&vsx_const);
+})
diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index a02e4cd03ef..c42a50944ac 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -491,6 +491,10 @@ const char *rs6000_type_string (tree type_node)
     return "voidc*";
   else if (type_node == float128_type_node)
     return "_Float128";
+  else if (type_node == float16_type_node)
+    return "_Float16";
+  else if (TARGET_FLOAT16 && type_node == bfloat16_type_node)
+    return "__bfloat16";
   else if (type_node == vector_pair_type_node)
     return "__vector_pair";
   else if (type_node == vector_quad_type_node)
@@ -758,6 +762,22 @@ rs6000_init_builtins (void)
   else
     ieee128_float_type_node = NULL_TREE;
 
+  /* __bfloat16 support.  */
+  if (TARGET_FLOAT16)
+    {
+      if (!bfloat16_type_node)
+       {
+         bfloat16_type_node = make_node (REAL_TYPE);
+         TYPE_PRECISION (bfloat16_type_node) = 16;
+         SET_TYPE_MODE (bfloat16_type_node, BFmode);
+         layout_type (bfloat16_type_node);
+         t = build_qualified_type (bfloat16_type_node, TYPE_QUAL_CONST);
+       }
+
+      lang_hooks.types.register_builtin_type (bfloat16_type_node,
+                                             "__bfloat16");
+    }
+
   /* Vector pair and vector quad support.  */
   vector_pair_type_node = make_node (OPAQUE_TYPE);
   SET_TYPE_MODE (vector_pair_type_node, OOmode);
diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index e202fd6c7df..70e6c6ebdc9 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -583,6 +583,12 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT flags)
   if ((flags & OPTION_MASK_FLOAT128_HW) != 0)
     rs6000_define_or_undefine_macro (define_p, "__FLOAT128_HARDWARE__");
 
+  /* 16-bit floating point support.  */
+  if ((flags & OPTION_MASK_FLOAT16) != 0)
+    {
+      rs6000_define_or_undefine_macro (define_p, "__FLOAT16__");
+      rs6000_define_or_undefine_macro (define_p, "__BFLOAT16__");
+    }
   /* Tell the user if we are targeting CELL.  */
   if (rs6000_cpu == PROCESSOR_CELL)
     rs6000_define_or_undefine_macro (define_p, "__PPU__");
diff --git a/gcc/config/rs6000/rs6000-call.cc b/gcc/config/rs6000/rs6000-call.cc
index 7541050ffe7..1c5bec25ecb 100644
--- a/gcc/config/rs6000/rs6000-call.cc
+++ b/gcc/config/rs6000/rs6000-call.cc
@@ -685,6 +685,18 @@ init_cumulative_args (CUMULATIVE_ARGS *cum, tree fntype,
             " altivec instructions are disabled, use %qs"
             " to enable them", "-maltivec");
     }
+
+  /* Warn that __bfloat16 and _Float16 might be returned differently in the
+     future.  The issue is currently 16-bit floating point is returned in
+     floating point register #1 in 16-bit format.  We may or may not want to
+     return it as a scalar 64-bit value.  */
+  if (fntype && warn_psabi && !cum->libcall)
+    {
+      machine_mode ret_mode = TYPE_MODE (TREE_TYPE (fntype));
+      if (ret_mode == BFmode || ret_mode == HFmode)
+       warning (OPT_Wpsabi, "%s might be returned differently in the future",
+                ret_mode == BFmode ? "__bfloat16" : "_Float16");
+    }
 }
 
 
@@ -1643,6 +1655,14 @@ rs6000_function_arg (cumulative_args_t cum_v, const function_arg_info &arg)
       return NULL_RTX;
     }
 
+  /* Warn that _Float16 and __bfloat16 might be passed differently in the
+     future.  The issue is currently 16-bit floating point values are passed in
+     floating point registers in the native 16-bit format.  We may or may not
+     want to pass the value as a scalar 64-bit value.  */
+  if (warn_psabi && !cum->libcall && (mode == BFmode || mode == HFmode))
+    warning (OPT_Wpsabi, "%s might be passed differently in the future",
+            mode == BFmode ? "__bfloat16" : "_Float16");
+
   /* Return a marker to indicate whether CR1 needs to set or clear the
      bit that V.4 uses to say fp args were passed in registers.
      Assume that we don't need the marker for software floating point,
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 234eb0ae2b3..d29081837b3 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -250,6 +250,7 @@ typedef struct {
   bool all_words_same;                 /* Are the words all equal?  */
   bool all_half_words_same;            /* Are the half words all equal?  */
   bool all_bytes_same;                 /* Are the bytes all equal?  */
+  machine_mode mode;                   /* Original constant mode.  */
 } vec_const_128bit_type;
 
 extern bool vec_const_128bit_to_bytes (rtx, machine_mode,
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 052e95d5a39..8004f6449ac 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -1929,7 +1929,8 @@ rs6000_hard_regno_mode_ok_uncached (int regno, machine_mode mode)
 
       if (ALTIVEC_REGNO_P (regno))
        {
-         if (GET_MODE_SIZE (mode) < 16 && !reg_addr[mode].scalar_in_vmx_p)
+         if (GET_MODE_SIZE (mode) < 16 && !reg_addr[mode].scalar_in_vmx_p
+             && !FP16_SCALAR_MODE_P (mode))
            return 0;
 
          return ALTIVEC_REGNO_P (last_regno);
@@ -2022,7 +2023,8 @@ rs6000_modes_tieable_p (machine_mode mode1, machine_mode mode2)
 {
   if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode
       || mode1 == TDOmode || mode2 == PTImode || mode2 == OOmode
-      || mode2 == XOmode || mode2 == TDOmode)
+      || mode2 == XOmode || mode2 == TDOmode
+      || FP16_SCALAR_MODE_P (mode1) || FP16_SCALAR_MODE_P (mode2))
     return mode1 == mode2;
 
   if (ALTIVEC_OR_VSX_VECTOR_MODE (mode1))
@@ -2287,6 +2289,8 @@ rs6000_debug_reg_global (void)
     DImode,
     TImode,
     PTImode,
+    BFmode,
+    HFmode,
     SFmode,
     DFmode,
     TFmode,
@@ -2669,8 +2673,14 @@ rs6000_setup_reg_addr_masks (void)
 
       /* SDmode is special in that we want to access it only via REG+REG
         addressing on power7 and above, since we want to use the LFIWZX and
-        STFIWZX instructions to load it.  */
-      bool indexed_only_p = (m == SDmode && TARGET_NO_SDMODE_STACK);
+        STFIWZX instructions to load it.
+
+        Never allow offset addressing for 16-bit floating point modes, since
+        it is expected that 16-bit floating point should always go into the
+        vector registers and we only have indexed and indirect 16-bit loads to
+        VSR registers.  */
+      bool indexed_only_p = ((m == SDmode && TARGET_NO_SDMODE_STACK)
+                            || FP16_SCALAR_MODE_P (m));
 
       any_addr_mask = 0;
       for (rc = FIRST_RELOAD_REG_CLASS; rc <= LAST_RELOAD_REG_CLASS; rc++)
@@ -2734,6 +2744,7 @@ rs6000_setup_reg_addr_masks (void)
                  && !complex_p
                  && (m != E_DFmode || !TARGET_VSX)
                  && (m != E_SFmode || !TARGET_P8_VECTOR)
+                 && !FP16_SCALAR_MODE_P (m)
                  && !small_int_vsx_p)
                {
                  addr_mask |= RELOAD_REG_PRE_INCDEC;
@@ -2991,6 +3002,15 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
       rs6000_vector_align[V1TImode] = 128;
     }
 
+  /* _Float16 support.  */
+  if (TARGET_FLOAT16)
+    {
+      rs6000_vector_mem[HFmode] = VECTOR_VSX;
+      rs6000_vector_mem[BFmode] = VECTOR_VSX;
+      rs6000_vector_align[HFmode] = 16;
+      rs6000_vector_align[BFmode] = 16;
+    }
+
   /* DFmode, see if we want to use the VSX unit.  Memory is handled
      differently, so don't set rs6000_vector_mem.  */
   if (TARGET_VSX)
@@ -3119,6 +3139,14 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
              reg_addr[TFmode].reload_load  = CODE_FOR_reload_tf_di_load;
            }
 
+         if (TARGET_FLOAT16)
+           {
+             reg_addr[HFmode].reload_store = CODE_FOR_reload_hf_di_store;
+             reg_addr[BFmode].reload_store = CODE_FOR_reload_bf_di_store;
+             reg_addr[HFmode].reload_load  = CODE_FOR_reload_hf_di_load;
+             reg_addr[BFmode].reload_load  = CODE_FOR_reload_bf_di_load;
+           }
+
          /* Only provide a reload handler for SDmode if lfiwzx/stfiwx are
             available.  */
          if (TARGET_NO_SDMODE_STACK)
@@ -3219,6 +3247,14 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
              reg_addr[TFmode].reload_load  = CODE_FOR_reload_tf_si_load;
            }
 
+         if (TARGET_FLOAT16)
+           {
+             reg_addr[HFmode].reload_store = CODE_FOR_reload_hf_si_store;
+             reg_addr[BFmode].reload_store = CODE_FOR_reload_bf_si_store;
+             reg_addr[HFmode].reload_load  = CODE_FOR_reload_hf_si_load;
+             reg_addr[BFmode].reload_load  = CODE_FOR_reload_bf_si_load;
+           }
+
          /* Only provide a reload handler for SDmode if lfiwzx/stfiwx are
             available.  */
          if (TARGET_NO_SDMODE_STACK)
@@ -3964,6 +4000,19 @@ rs6000_option_override_internal (bool global_init_p)
        }
     }
 
+  /* 16-bit floating point needs 64-bit power8 at a minimum in order to load up
+     16-bit values into vector registers via loads/stores from GPRs and then
+     using direct moves.  Don't allow 16-bit float on big endian systems at the
+     current time.  */
+  if (TARGET_FLOAT16 && (!TARGET_DIRECT_MOVE_64BIT || BYTES_BIG_ENDIAN))
+    {
+      rs6000_isa_flags &= ~OPTION_MASK_FLOAT16;
+      if (rs6000_isa_flags_explicit & OPTION_MASK_FLOAT16)
+       error ("%qs is only available on 64-bit little endian systems "
+              "that use at least %qs",
+              "-mfloat16", "-mcpu=power8");
+    }
+
   /* If hard-float/altivec/vsx were explicitly turned off then don't allow
      the -mcpu setting to enable options that conflict. */
   if ((!TARGET_HARD_FLOAT || !TARGET_ALTIVEC || !TARGET_VSX)
@@ -6554,6 +6603,12 @@ easy_altivec_constant (rtx op, machine_mode mode)
       return 0;
     }
 
+  /* For 16-bit floating point vectors, only allow 0.0 and -0.0 as easy altivec
+     constants.  */
+  if (FP16_VECTOR_MODE_P (mode))
+    return (zero_constant (op, mode) || minus_zero_constant (op, mode)
+           ? 8 : 0);
+
   /* V1TImode is a special container for TImode.  Ignore for now.  */
   else if (mode == V1TImode)
     return 0;
@@ -6661,6 +6716,12 @@ xxspltib_constant_p (rtx op,
   /* Handle (vec_duplicate <constant>).  */
   if (GET_CODE (op) == VEC_DUPLICATE)
     {
+      element = XEXP (op, 0);
+
+      /* For 16-bit floating point, the only valid use of xxspltib is 0.0.  */
+      if (FP16_VECTOR_MODE_P (mode))
+       return element == CONST0_RTX (GET_MODE_INNER (mode));
+
       if (mode != V16QImode && mode != V8HImode && mode != V4SImode
          && mode != V2DImode)
        return false;
@@ -6677,6 +6738,20 @@ xxspltib_constant_p (rtx op,
   /* Handle (const_vector [...]).  */
   else if (GET_CODE (op) == CONST_VECTOR)
     {
+      /* For V8BFmode & V8HFmode, the only valid use of xxspltib is 0.0.  */
+      if (FP16_VECTOR_MODE_P (mode))
+       {
+         if (op == CONST0_RTX (mode))
+           return true;
+
+         rtx zero = CONST0_RTX (GET_MODE_INNER (mode));
+         for (i = 0; i < nunits; i++)
+           if (CONST_VECTOR_ELT (op, i) != zero)
+             return false;
+
+         return true;
+       }
+
       if (mode != V16QImode && mode != V8HImode && mode != V4SImode
          && mode != V2DImode)
        return false;
@@ -7123,6 +7198,15 @@ rs6000_expand_vector_init (rtx target, rtx vals)
       return;
     }
 
+  /* Special case splats of 16-bit floating point.  */
+  if (all_same && FP16_VECTOR_MODE_P (mode))
+    {
+      rtx op0 = force_reg (GET_MODE_INNER (mode), XVECEXP (vals, 0, 0));
+      rtx dup = gen_rtx_VEC_DUPLICATE (mode, op0);
+      emit_insn (gen_rtx_SET (target, dup));
+      return;
+    }
+
   /* Special case initializing vector short/char that are splats if we are on
      64-bit systems with direct move.  */
   if (all_same && TARGET_DIRECT_MOVE_64BIT
@@ -7187,8 +7271,7 @@ rs6000_expand_vector_init (rtx target, rtx vals)
     }
 
   if (TARGET_DIRECT_MOVE
-      && (mode == V16QImode || mode == V8HImode || mode == V8HFmode
-         || mode == V8BFmode))
+      && (mode == V16QImode || mode == V8HImode || FP16_VECTOR_MODE_P (mode)))
     {
       rtx op[16];
       /* Force the values into word_mode registers.  */
@@ -7619,6 +7702,10 @@ rs6000_expand_vector_set (rtx target, rtx val, rtx elt_rtx)
            insn = gen_vsx_set_v4si_p9 (target, target, val, elt_rtx);
          else if (mode == V8HImode)
            insn = gen_vsx_set_v8hi_p9 (target, target, val, elt_rtx);
+         else if (mode == V8HFmode)
+           insn = gen_vsx_set_v8hf_p9 (target, target, val, elt_rtx);
+         else if (mode == V8BFmode)
+           insn = gen_vsx_set_v8bf_p9 (target, target, val, elt_rtx);
          else if (mode == V16QImode)
            insn = gen_vsx_set_v16qi_p9 (target, target, val, elt_rtx);
          else if (mode == V4SFmode)
@@ -7736,6 +7823,22 @@ rs6000_expand_vector_extract (rtx target, rtx vec, rtx elt)
            }
          else
            break;
+       case E_V8HFmode:
+         if (TARGET_DIRECT_MOVE_64BIT)
+           {
+             emit_insn (gen_vsx_extract_v8hf (target, vec, elt));
+             return;
+           }
+         else
+           break;
+       case E_V8BFmode:
+         if (TARGET_DIRECT_MOVE_64BIT)
+           {
+             emit_insn (gen_vsx_extract_v8bf (target, vec, elt));
+             return;
+           }
+         else
+           break;
        case E_V4SImode:
          if (TARGET_DIRECT_MOVE_64BIT)
            {
@@ -7783,6 +7886,14 @@ rs6000_expand_vector_extract (rtx target, rtx vec, rtx elt)
          emit_insn (gen_vsx_extract_v8hi_var (target, vec, elt));
          return;
 
+       case E_V8HFmode:
+         emit_insn (gen_vsx_extract_v8hf_var (target, vec, elt));
+         return;
+
+       case E_V8BFmode:
+         emit_insn (gen_vsx_extract_v8bf_var (target, vec, elt));
+         return;
+
        case E_V16QImode:
          emit_insn (gen_vsx_extract_v16qi_var (target, vec, elt));
          return;
@@ -8060,7 +8171,10 @@ rs6000_split_vec_extract_var (rtx dest, rtx src, rtx element, rtx tmp_gpr,
       /* See if we want to generate VEXTU{B,H,W}{L,R}X if the destination is in
         a general purpose register.  */
       if (TARGET_P9_VECTOR
-         && (mode == V16QImode || mode == V8HImode || mode == V4SImode)
+         && (mode == V16QImode
+             || mode == V8HImode
+             || mode == V4SImode
+             || FP16_VECTOR_MODE_P (mode))
          && INT_REGNO_P (dest_regno)
          && ALTIVEC_REGNO_P (src_regno)
          && INT_REGNO_P (element_regno))
@@ -8073,7 +8187,7 @@ rs6000_split_vec_extract_var (rtx dest, rtx src, rtx element, rtx tmp_gpr,
                       ? gen_vextublx (dest_si, element_si, src)
                       : gen_vextubrx (dest_si, element_si, src));
 
-         else if (mode == V8HImode)
+         else if (mode == V8HImode || FP16_VECTOR_MODE_P (mode))
            {
              rtx tmp_gpr_si = gen_rtx_REG (SImode, REGNO (tmp_gpr));
              emit_insn (gen_ashlsi3 (tmp_gpr_si, element_si, const1_rtx));
@@ -8175,6 +8289,8 @@ rs6000_split_vec_extract_var (rtx dest, rtx src, rtx element, rtx tmp_gpr,
 
        case E_V4SImode:
        case E_V8HImode:
+       case E_V8HFmode:
+       case E_V8BFmode:
        case E_V16QImode:
          {
            rtx tmp_altivec_di = gen_rtx_REG (DImode, REGNO (tmp_altivec));
@@ -8792,6 +8908,13 @@ reg_offset_addressing_ok_p (machine_mode mode)
     case E_TDOmode:
       return TARGET_DENSE_MATH;
 
+      /* For 16-bit floating point types, do not allow offset addressing, since
+        it is assumed that most of the use will be in vector registers, and we
+        only have reg+reg addressing for 16-bit modes.  */
+    case E_BFmode:
+    case E_HFmode:
+      return false;
+
     case E_SDmode:
       /* If we can do direct load/stores of SDmode, restrict it to reg+reg
         addressing for the LFIWZX and STFIWX instructions.  */
@@ -9076,6 +9199,13 @@ rs6000_legitimate_offset_address_p (machine_mode mode, rtx x,
   extra = 0;
   switch (mode)
     {
+      /* For 16-bit floating point types, do not allow offset addressing, since
+        it is assumed that most of the use will be in vector registers, and we
+        only have reg+reg addressing for 16-bit modes.  */
+    case E_BFmode:
+    case E_HFmode:
+      return false;
+
     case E_DFmode:
     case E_DDmode:
     case E_DImode:
@@ -9177,6 +9307,11 @@ macho_lo_sum_memory_operand (rtx x, machine_mode mode)
 static bool
 legitimate_lo_sum_address_p (machine_mode mode, rtx x, int strict)
 {
+  /* For 16-bit floating point types, do not allow offset addressing, since
+     it is assumed that most of the use will be in vector registers, and we
+     only have reg+reg addressing for 16-bit modes.  */
+  if (FP16_SCALAR_MODE_P (mode))
+    return false;
   if (GET_CODE (x) != LO_SUM)
     return false;
   if (!REG_P (XEXP (x, 0)))
@@ -12784,6 +12919,9 @@ rs6000_secondary_reload_simple_move (enum rs6000_reg_type to_type,
       && ((to_type == GPR_REG_TYPE && from_type == VSX_REG_TYPE)
          || (to_type == VSX_REG_TYPE && from_type == GPR_REG_TYPE)))
     {
+      if (FP16_SCALAR_MODE_P (mode))
+       return true;
+
       if (TARGET_POWERPC64)
        {
          /* ISA 2.07: MTVSRD or MVFVSRD.  */
@@ -13582,6 +13720,11 @@ rs6000_preferred_reload_class (rtx x, enum reg_class rclass)
          || mode_supports_dq_form (mode))
        return rclass;
 
+      /* IEEE 16-bit and bfloat16 don't support offset addressing, but they can
+        go in any floating point/vector register.  */
+      if (FP16_SCALAR_MODE_P (mode))
+       return rclass;
+
       /* If this is a scalar floating point value and we don't have D-form
         addressing, prefer the traditional floating point registers so that we
         can use D-form (register+offset) addressing.  */
@@ -13599,6 +13742,16 @@ rs6000_preferred_reload_class (rtx x, enum reg_class rclass)
       return rclass;
     }
 
+  /* For 16-bit floating point scalar modes, if we have lxsihzx/stxsihzx from
+     Power9, prefer the vector registers.  On power8, we will need to use GPRs
+     to do load/store.  For 16-bit floating point vector modes, only prefer
+     VSX.  */
+  if (FP16_SCALAR_MODE_P (mode))
+    return TARGET_P9_VECTOR ? VSX_REGS : rclass;
+
+  if (FP16_VECTOR_MODE_P (mode))
+    return VSX_REGS;
+
   if (is_constant || GET_CODE (x) == PLUS)
     {
       if (reg_class_subset_p (GENERAL_REGS, rclass))
@@ -13819,6 +13972,9 @@ rs6000_can_change_mode_class (machine_mode from,
   unsigned from_size = GET_MODE_SIZE (from);
   unsigned to_size = GET_MODE_SIZE (to);
 
+  if (FP16_SCALAR_MODE_P (from) || FP16_SCALAR_MODE_P (to))
+    return from_size == to_size;
+
   if (from_size != to_size)
     {
       enum reg_class xclass = (TARGET_VSX) ? VSX_REGS : FLOAT_REGS;
@@ -14036,7 +14192,8 @@ rs6000_output_move_128bit (rtx operands[])
          else if (TARGET_P9_VECTOR)
            return "lxvx %x0,%y1";
 
-         else if (mode == V16QImode || mode == V8HImode || mode == V4SImode)
+         else if (mode == V16QImode || mode == V8HImode || mode == V4SImode
+                  || FP16_VECTOR_MODE_P (mode))
            return "lxvw4x %x0,%y1";
 
          else
@@ -14074,7 +14231,8 @@ rs6000_output_move_128bit (rtx operands[])
          else if (TARGET_P9_VECTOR)
            return "stxvx %x1,%y0";
 
-         else if (mode == V16QImode || mode == V8HImode || mode == V4SImode)
+         else if (mode == V16QImode || mode == V8HImode || mode == V4SImode
+                  || FP16_VECTOR_MODE_P (mode))
            return "stxvw4x %x1,%y0";
 
          else
@@ -23161,7 +23319,7 @@ rs6000_load_constant_and_splat (machine_mode mode, REAL_VALUE_TYPE dconst)
 {
   rtx reg;
 
-  if (mode == SFmode || mode == DFmode)
+  if (mode == SFmode || mode == DFmode || FP16_SCALAR_MODE_P (mode))
     {
       rtx d = const_double_from_real_value (dconst, mode);
       reg = force_reg (mode, d);
@@ -24494,6 +24652,8 @@ rs6000_scalar_mode_supported_p (scalar_mode mode)
     return default_decimal_float_supported_p ();
   else if (TARGET_FLOAT128_TYPE && (mode == KFmode || mode == IFmode))
     return true;
+  else if (FP16_SCALAR_MODE_P (mode))
+    return true;
   else
     return default_scalar_mode_supported_p (mode);
 }
@@ -24518,6 +24678,10 @@ rs6000_libgcc_floating_mode_supported_p (scalar_float_mode mode)
     case E_KFmode:
       return TARGET_FLOAT128_TYPE && !TARGET_IEEEQUAD;
 
+    case E_BFmode:
+    case E_HFmode:
+      return TARGET_FLOAT16;
+
     default:
       return false;
     }
@@ -24545,6 +24709,9 @@ rs6000_floatn_mode (int n, bool extended)
     {
       switch (n)
        {
+       case 16:
+         return TARGET_FLOAT16 ? SFmode : opt_scalar_float_mode ();
+
        case 32:
          return DFmode;
 
@@ -24566,6 +24733,9 @@ rs6000_floatn_mode (int n, bool extended)
     {
       switch (n)
        {
+       case 16:
+         return TARGET_FLOAT16 ? HFmode : opt_scalar_float_mode ();
+
        case 32:
          return SFmode;
 
@@ -24689,6 +24859,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
   { "power11",                 OPTION_MASK_POWER11,            false, false },
   { "hard-dfp",                        OPTION_MASK_DFP,                false, true  },
   { "htm",                     OPTION_MASK_HTM,                false, true  },
+  { "float16",                 OPTION_MASK_FLOAT16,            false, true  },
   { "isel",                    OPTION_MASK_ISEL,               false, true  },
   { "mfcrf",                   OPTION_MASK_MFCRF,              false, true  },
   { "mfpgpr",                  0,                              false, true  },
@@ -29117,24 +29288,37 @@ constant_fp_to_128bit_vector (rtx op,
   const REAL_VALUE_TYPE *rtype = CONST_DOUBLE_REAL_VALUE (op);
   long real_words[VECTOR_128BIT_WORDS];
 
-  /* Make sure we don't overflow the real_words array and that it is
-     filled completely.  */
-  gcc_assert (num_words <= VECTOR_128BIT_WORDS && (bitsize % 32) == 0);
-
-  real_to_target (real_words, rtype, mode);
+  /* For 16-bit floating point, the constant doesn't fill the whole 32-bit
+     word.  Deal with it here, storing the bytes in big endian fashion.  */
+  if (FP16_SCALAR_MODE_P (mode))
+    {
+      real_to_target (real_words, rtype, mode);
+      info->bytes[byte_num] = (unsigned char) (real_words[0] >> 8);
+      info->bytes[byte_num+1] = (unsigned char) (real_words[0]);
+    }
 
-  /* Iterate over each 32-bit word in the floating point constant.  The
-     real_to_target function puts out words in target endian fashion.  We need
-     to arrange the order so that the bytes are written in big endian order.  */
-  for (unsigned num = 0; num < num_words; num++)
+  else
     {
-      unsigned endian_num = (BYTES_BIG_ENDIAN
-                            ? num
-                            : num_words - 1 - num);
+      /* Make sure we don't overflow the real_words array and that it is filled
+        completely.  */
+      gcc_assert (num_words <= VECTOR_128BIT_WORDS && (bitsize % 32) == 0);
 
-      unsigned uvalue = real_words[endian_num];
-      for (int shift = 32 - 8; shift >= 0; shift -= 8)
-       info->bytes[byte_num++] = (uvalue >> shift) & 0xff;
+      real_to_target (real_words, rtype, mode);
+
+      /* Iterate over each 32-bit word in the floating point constant.  The
+        real_to_target function puts out words in target endian fashion.  We
+        need to arrange the order so that the bytes are written in big endian
+        order.  */
+      for (unsigned num = 0; num < num_words; num++)
+       {
+         unsigned endian_num = (BYTES_BIG_ENDIAN
+                                ? num
+                                : num_words - 1 - num);
+
+         unsigned uvalue = real_words[endian_num];
+         for (int shift = 32 - 8; shift >= 0; shift -= 8)
+           info->bytes[byte_num++] = (uvalue >> shift) & 0xff;
+       }
     }
 
   /* Mark that this constant involves floating point.  */
@@ -29173,6 +29357,7 @@ vec_const_128bit_to_bytes (rtx op,
     return false;
 
   /* Set up the bits.  */
+  info->mode = mode;
   switch (GET_CODE (op))
     {
       /* Integer constants, default to double word.  */
@@ -29400,6 +29585,10 @@ constant_generates_xxspltiw (vec_const_128bit_type *vsx_const)
   if (!TARGET_SPLAT_WORD_CONSTANT || !TARGET_PREFIXED || !TARGET_VSX)
     return 0;
 
+  /* HFmode/BFmode constants can always use XXSPLTIW.  */
+  if (FP16_SCALAR_MODE_P (vsx_const->mode))
+    return 1;
+
   if (!vsx_const->all_words_same)
     return 0;
 
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index abbc9c1c4cf..42629348257 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -343,6 +343,26 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
    || ((MODE) == TDmode)                                               \
    || (!TARGET_FLOAT128_TYPE && FLOAT128_IEEE_P (MODE)))
 
+/* Is this a valid 16-bit scalar floating point mode?  */
+#define FP16_SCALAR_MODE_P(MODE)                                       \
+  (TARGET_FLOAT16 && ((MODE) == HFmode || (MODE) == BFmode))
+
+/* Is this a valid 16-bit vector floating point mode?  */
+#define FP16_VECTOR_MODE_P(MODE)                                       \
+  (TARGET_FLOAT16 && ((MODE) == V8HFmode || (MODE) == V8BFmode))
+
+/* Do we have conversion support in hardware for the 16-bit floating point?  */
+#define TARGET_BFLOAT16_HW     (TARGET_FLOAT16 && TARGET_POWER10)
+#define TARGET_FLOAT16_HW      (TARGET_FLOAT16 && TARGET_P9_VECTOR)
+
+/* Do we have conversion support in hardware for the 16-bit floating point and
+   also enable the 16-bit floating point vector optimizations?  */
+#define TARGET_BFLOAT16_HW_VECTOR                                      \
+  (TARGET_FLOAT16 && TARGET_POWER10 && TARGET_BFLOAT16_VECTOR)
+
+#define TARGET_FLOAT16_HW_VECTOR                                       \
+  (TARGET_FLOAT16 && TARGET_POWER9 && TARGET_FLOAT16_VECTOR)
+
 /* Return true for floating point that does not use a vector register.  */
 #define SCALAR_FLOAT_MODE_NOT_VECTOR_P(MODE)                           \
   (SCALAR_FLOAT_MODE_P (MODE) && !FLOAT128_VECTOR_P (MODE))
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index b188a0e6910..6f2d6cb9023 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -716,6 +716,8 @@ (define_code_attr uns [(fix         "")
 ; A generic w/d attribute, for things like cmpw/cmpd.
 (define_mode_attr wd [(QI    "b")
                      (HI    "h")
+                     (BF    "h")
+                     (HF    "h")
                      (SI    "w")
                      (DI    "d")
                      (V16QI "b")
@@ -15893,3 +15895,4 @@ (define_insn "hashchk"
 (include "htm.md")
 (include "fusion.md")
 (include "pcrel-opt.md")
+(include "float16.md")
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 72578644037..588e4739f6b 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -643,6 +643,10 @@ mdense_math
 Target Mask(DENSE_MATH) Var(rs6000_isa_flags)
 Generate (do not generate) dense math MMA+ instructions.
 
+mfloat16
+Target Mask(FLOAT16) Var(rs6000_isa_flags)
+Enable or disable 16-bit floating point.
+
 ; Documented parameters
 
 -param=rs6000-vect-unroll-limit=
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 42bd76817b8..4a0214e4e7e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1421,6 +1421,7 @@ See RS/6000 and PowerPC Options.
 -mquad-memory-atomic  -mno-quad-memory-atomic
 -mcompat-align-parm  -mno-compat-align-parm
 -mfloat128  -mno-float128  -mfloat128-hardware  -mno-float128-hardware
+-mfloat16  -mno-float16
 -mgnu-attribute  -mno-gnu-attribute
 -mstack-protector-guard=@var{guard} -mstack-protector-guard-reg=@var{reg}
 -mstack-protector-guard-offset=@var{offset} -mprefixed -mno-prefixed
@@ -32540,6 +32541,28 @@ The default for @option{-mfloat128-hardware} is enabled on PowerPC
 Linux systems using the ISA 3.0 instruction set, and disabled on other
 systems.
 
+
+@opindex mfloat16
+@opindex mno-float16
+@item -mfloat16
+@itemx -mno-float16
+Enable/disable both the @code{_Float16} and @code{__bfloat16} keywords
+for using 16-bit floating point.
+
+The @code{_Float16} keyword is for IEEE 16-bit floating point and GCC
+generates either software emulation for IEEE 16-bit floating point or
+hardware instructions.
+
+The @code{__bfloat16} keyword is for Google Brain 16-bit floating
+point and GCC generates either software emulation for Google Brain
+16-bit floating point or hardware instructions.
+
+At the current time, 16-bit floating point support is experimental,
+and support may be changed in future releases.  If you pass or return
+a 16-bit floating point value, GCC will issue a warning that the ABI
+may change in the future unless you use the @option{-Wno-psabi}
+option.
+
 @opindex m32
 @opindex m64
 @item -m32
-- 
2.51.1
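
For readers following the HFmode/BFmode branch added to
constant_fp_to_128bit_vector above, here is a minimal standalone sketch of
the byte layout it produces.  It is not code from the patch.  The helper
names float_to_bf16_bits and store_fp16_big_endian are hypothetical, the
16-byte buffer stands in for info->bytes, and the sketch assumes the host
float type is IEEE binary32.  The bfloat16 image is taken by truncating a
binary32 value to its upper 16 bits, which skips the rounding a real
conversion would do.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the bfloat16 bit image of F by keeping the
   upper 16 bits of its IEEE binary32 encoding (truncation, no rounding).
   Assumes the host float type is IEEE binary32.  */
uint16_t
float_to_bf16_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return (uint16_t) (u >> 16);
}

/* Hypothetical helper: store the 16-bit image BITS at BYTES[BYTE_NUM] and
   BYTES[BYTE_NUM + 1], most significant byte first.  This mirrors the two
   stores in the new 16-bit branch, where the image sits in the low 16 bits
   of the first word.  */
void
store_fp16_big_endian (unsigned char bytes[16], unsigned byte_num,
                       uint16_t bits)
{
  /* Both bytes must fit in the 128-bit (16-byte) buffer.  */
  assert (byte_num + 1 < 16);
  bytes[byte_num] = (unsigned char) (bits >> 8);
  bytes[byte_num + 1] = (unsigned char) (bits & 0xff);
}
```

With these helpers, -0.0 in either 16-bit format is the image 0x8000 and
is stored as the bytes 0x80, 0x00.  The bfloat16 image of 1.0f is 0x3f80
and is stored as 0x3f, 0x80.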


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: [email protected]
