Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/23893 )
Change subject: KUDU-3736 add CodegenTest.CodegenRandomSchemas scenario
......................................................................


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/23893/7/src/kudu/codegen/codegen-test.cc
File src/kudu/codegen/codegen-test.cc:

http://gerrit.cloudera.org:8080/#/c/23893/7/src/kudu/codegen/codegen-test.cc@688
PS7, Line 688:     const size_t num_columns = 1 + gen() % cs_library.size();
:     VLOG(1) << StringPrintf("thread %2zd: %2zd-column schema",
:                             thread_idx, num_columns);
:     vector<size_t> idx_seq;
:     idx_seq.reserve(num_columns);
:     for (size_t i = 0; i < num_columns; ++i) {
:       idx_seq.push_back(gen() % cs_library.size());
:     }
:
:     // Create a schema with the given number of columns, picking columns
:     // from the 'column schema library' in random order.
:     unordered_set<size_t> seen_idx;
:     seen_idx.emplace(0); // the 'key' column is always present
:     SchemaBuilder sb;
:     CHECK_OK(sb.AddKeyColumn(cs_library.front()));
:     for (auto idx : idx_seq) {
:       if (!seen_idx.emplace(idx).second) {
:         // A randomly chosen index is duplicate, ignore and continue.
:         continue;
:       }
:       CHECK_OK(sb.AddColumn(cs_library[idx]));
:     }
> std::sample does exactly this (no repetition of the same element).

That's a good point, done.


http://gerrit.cloudera.org:8080/#/c/23893/7/src/kudu/codegen/codegen-test.cc@716
PS7, Line 716:       vector<size_t> col_indices(gen() % seen_idx.size());
:       std::iota(col_indices.begin(), col_indices.end(), 0);
:       std::shuffle(col_indices.begin(), col_indices.end(), gen);
:
:       VLOG(3) << StringPrintf("thread %2zd: %2zd-column projection",
:                               thread_idx, col_indices.size());
:
:       // Convert the indices of columns in the 'column schema library'
:       // into internal column identifiers of this current schema.
:       vector<ColumnId> col_ids;
:       col_ids.reserve(col_indices.size());
:       for (size_t col_idx : col_indices) {
:         col_ids.push_back(schema.column_id(col_idx));
:       }
:       // Create a projection with columns at the specified indices.
:       CHECK_OK(schema.CreateProjectionByIdsIgnoreMissing(col_ids, &projection));
:     }
:
:     // Request code generation.
> Could be much simpler with std::sample.

Done



--
To view, visit http://gerrit.cloudera.org:8080/23893
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic51e8fec02f74ecc11fa740f05ffeb9a7f41d8d9
Gerrit-Change-Number: 23893
Gerrit-PatchSet: 7
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Martonka <[email protected]>
Gerrit-Comment-Date: Fri, 06 Feb 2026 07:21:17 +0000
Gerrit-HasComments: Yes
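
Illustration only, not the code from the updated patch set: both comments above suggest std::sample (C++17, <algorithm>) in place of the manual "draw an index, skip duplicates" loop. A minimal sketch of that idea follows. The helper name SampleDistinctIndices and its parameters are hypothetical and do not come from codegen-test.cc, and std::mt19937 is assumed as the random generator.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper (not part of the Kudu code base): pick up to 'count'
// distinct indices from the range [0, library_size).
std::vector<std::size_t> SampleDistinctIndices(std::size_t library_size,
                                               std::size_t count,
                                               std::mt19937& gen) {
  // The population to sample from: 0, 1, ..., library_size - 1.
  std::vector<std::size_t> all(library_size);
  std::iota(all.begin(), all.end(), 0);

  std::vector<std::size_t> picked;
  picked.reserve(std::min(count, library_size));
  // std::sample selects without replacement, so no index appears twice and
  // no 'seen' set is needed. If 'count' exceeds the population size, the
  // whole population is returned.
  std::sample(all.begin(), all.end(), std::back_inserter(picked), count, gen);
  return picked;
}
```

Note that with forward iterators as input std::sample keeps the selected elements in their original relative order, so a std::shuffle over the result would still be needed if the columns are meant to appear in random order.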
