cconvey commented on code in PR #12204:
URL: https://github.com/apache/tvm/pull/12204#discussion_r949329234
##########
src/runtime/hexagon/ops/conv2d_hvx.cc:
##########

@@ -0,0 +1,473 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include <HAP_compute_res.h>
+#include <hexagon_types.h>
+#include <hvx_hexagon_protos.h>
+#include <tvm/runtime/c_runtime_api.h>
+#include <tvm/runtime/device_api.h>
+
+#include <algorithm>
+#include <cassert>
+#include <cinttypes>
+
+#include "tvm/runtime/hexagon/ops/conv2d.h"
+
+// Current limitations:
+// - N in NHWC must be 1
+// - dilated convolutions are not supported
+// - Bias is not accepted
+// - Optional "relu" is not performed
+
+// Packed arguments:
+//   0: DLTensor activations (NHWC)
+//   1: DLTensor weights (HWIO)
+//   2: int offset_top
+//   3: int offset_left
+//   4: int stride_h
+//   5: int stride_w
+//   6: DLTensor output (NHWC)
+extern "C" int conv2d_packed(TVMValue* args, int* type_codes, int num_args, TVMValue* out_val,
+                             int out_code, void* res_handle);
+
+namespace tvm {
+namespace runtime {
+namespace hexagon {
+
+inline uint16_t* getElementPtr(int block_out_y, int block_out_x, int block_out_c, int yi, int xio,
+                               int ci, int xii, const DLTensor& block) {
+  auto block_ptr = nhwc_at(block, 0, block_out_y, block_out_x, block_out_c);
+  auto block_offset = yi * 128 + xio * 64 + ci * 2 + xii;
+  auto first_element_ptr = reinterpret_cast<uint16_t*>(block_ptr);
+  return first_element_ptr + block_offset;
+}

Review Comment:
   This function's name and argument list suggest that it handles tensors with a variety of element dtypes. But its body and return type assume `uint16_t`. Is there some reason that's a safe assumption in this context?
   
   If yes, then it might be helpful to add some comments explaining why. If not, then I'd suggest some form of disambiguation, e.g.:
   - rename this to `getElementPtr_u16`, or
   - convert this to a template that's parameterized by dtype.
   
   Even if this `uint16_t` is being used as a stand-in for qfloat16 or IEEE-754 half-precision floats, it's not (immediately) obvious that this code will only ever be used for _16-bit_ HVX values.
