u99127 commented on a change in pull request #4828: [QNN][TFLite] TFLite rounding mode support
URL: https://github.com/apache/incubator-tvm/pull/4828#discussion_r376672419
##########
File path: src/relay/qnn/util.cc
##########
@@ -22,13 +22,47 @@
* \brief Utility functions for QNN.
*/
+#include <limits>
#include "util.h"
#include "../pass/pattern_util.h"
namespace tvm {
namespace relay {
namespace qnn {
+/*! \brief This function implements the rounding part of the ARMv7 NEON
+ * VQRDMULH instruction. For code reuse, the multiplied tensor is passed
+ * in directly as a parameter.
+ */
+Expr SaturatingRoundingDoublingHigh32(const Expr& input_tensor,
+ const Expr& multiplier_expr,
+ const Expr& scaled_tensor,
+ const Array<IndexExpr>& input_shape) {
+ DataType hp_dtype = DataType::Int(64);
+  std::int64_t pos_nudge_value = (1ll << 30);
+  std::int64_t neg_nudge_value = 1 - (1ll << 30);
+ auto pos_nudge = MakeConstantScalar(hp_dtype, pos_nudge_value);
+ auto neg_nudge = MakeConstantScalar(hp_dtype, neg_nudge_value);
+ auto pos_nudge_t = Full(pos_nudge, input_shape, hp_dtype);
+ auto neg_nudge_t = Full(neg_nudge, input_shape, hp_dtype);
+
+ auto int32_min = MakeConstantScalar(
+ hp_dtype, std::numeric_limits<std::int32_t>::min());
+ auto int32_max = MakeConstantScalar(
+ hp_dtype, std::numeric_limits<std::int32_t>::max());
+ auto int32_min_t = Full(int32_min, input_shape, hp_dtype);
+ auto int32_max_t = Full(int32_max, input_shape, hp_dtype);
Review comment:
While this generic lowering works, I cannot help but ask the obvious question:
why shouldn't the default implementation look to produce the vqrdmulh
instruction on Arm Neon when it can?
regards
Ramana