jrmccluskey commented on code in PR #39011:
URL: https://github.com/apache/beam/pull/39011#discussion_r3500276169
##########
sdks/python/apache_beam/testing/benchmarks/cloudml/criteo_tft/criteo.py:
##########
@@ -38,6 +38,23 @@ def get_transformed_categorical_column_name(column_name_or_id):
   return column_name + '_id'
+def fill_in_missing(feature, default_value):
+  """Fills missing values in a rank 2 SparseTensor.
+
+  Args:
+    feature: A rank 2 SparseTensor with at most one value per row.
+    default_value: The value to fill in for missing entries.
+
+  Returns:
+    A rank 1 Tensor with missing entries filled in.
+  """
+  feature = tf.sparse.to_dense(
+      tf.SparseTensor(
+          feature.indices, feature.values, [feature.dense_shape[0], 1]),
+      default_value=default_value)
+  return tf.squeeze(feature, axis=1)

Review Comment:
   +1. If the intent is to slide that boilerplate into a helper for clarity, it's probably best to just move it verbatim.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
