Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/23893 )

Change subject: KUDU-3736 add CodegenTest.CodegenRandomSchemas scenario
......................................................................


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/23893/7/src/kudu/codegen/codegen-test.cc
File src/kudu/codegen/codegen-test.cc:

http://gerrit.cloudera.org:8080/#/c/23893/7/src/kudu/codegen/codegen-test.cc@688
PS7, Line 688:         const size_t num_columns = 1 + gen() % cs_library.size();
             :         VLOG(1) << StringPrintf("thread %2zd: %2zd-column schema",
             :                                   thread_idx, num_columns);
             :         vector<size_t> idx_seq;
             :         idx_seq.reserve(num_columns);
             :         for (size_t i = 0; i < num_columns; ++i) {
             :           idx_seq.push_back(gen() % cs_library.size());
             :         }
             :
             :         // Create a schema with the given number of columns, picking columns
             :         // from the 'column schema library' in random order.
             :         unordered_set<size_t> seen_idx;
             :         seen_idx.emplace(0);  // the 'key' column is always present
             :         SchemaBuilder sb;
             :         CHECK_OK(sb.AddKeyColumn(cs_library.front()));
             :         for (auto idx : idx_seq) {
             :           if (!seen_idx.emplace(idx).second) {
             :             // A randomly chosen index is duplicate, ignore and continue.
             :             continue;
             :           }
             :           CHECK_OK(sb.AddColumn(cs_library[idx]));
             :         }
> std::sample does exactly this (no repetition of the same element).
That's a good point, done.
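
For reference, a minimal sketch of what the std::sample-based selection could look like. This is not the code from the updated patch set: PickColumnIndices and its parameters are hypothetical names, and plain indices stand in for the entries of cs_library. Since std::sample keeps the relative order of the input when given forward iterators, the sketch shuffles the non-key indices afterwards, assuming the "random order" mentioned in the original comment should be kept.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper (not part of Kudu): returns index 0 (the key column)
// followed by up to 'num_extra' distinct indices from [1, library_size),
// in random order.
std::vector<size_t> PickColumnIndices(size_t library_size,
                                      size_t num_extra,
                                      std::mt19937& gen) {
  assert(library_size > 0);

  // Candidate indices of the non-key columns: 1, 2, ..., library_size - 1.
  std::vector<size_t> candidates(library_size - 1);
  std::iota(candidates.begin(), candidates.end(), 1);

  std::vector<size_t> picked;
  picked.reserve(1 + std::min(num_extra, candidates.size()));
  picked.push_back(0);  // the key column is always present

  // std::sample never selects the same element twice, so no 'seen' set
  // is needed. It copies min(num_extra, candidates.size()) elements.
  std::sample(candidates.begin(), candidates.end(),
              std::back_inserter(picked), num_extra, gen);

  // std::sample preserves the input order here; shuffle the non-key part
  // to get the columns in random order.
  std::shuffle(picked.begin() + 1, picked.end(), gen);
  return picked;
}
```

With a helper like this, the caller would iterate over the returned indices and call sb.AddColumn(cs_library[idx]) for every index after the first, without any duplicate check.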


http://gerrit.cloudera.org:8080/#/c/23893/7/src/kudu/codegen/codegen-test.cc@716
PS7, Line 716:            vector<size_t> col_indices(gen() % seen_idx.size());
             :             std::iota(col_indices.begin(), col_indices.end(), 0);
             :             std::shuffle(col_indices.begin(), col_indices.end(), gen);
             :
             :             VLOG(3) << StringPrintf("thread %2zd: %2zd-column projection",
             :                                     thread_idx, col_indices.size());
             :
             :             // Convert the indices of columns in the 'column schema library'
             :             // into internal column identifiers of this current schema.
             :             vector<ColumnId> col_ids;
             :             col_ids.reserve(col_indices.size());
             :             for (size_t col_idx : col_indices) {
             :               col_ids.push_back(schema.column_id(col_idx));
             :             }
             :             // Create a projection with columns at the specified indices.
             :             CHECK_OK(schema.CreateProjectionByIdsIgnoreMissing(col_ids, &projection));
             :           }
             :
             :           // Request code generation.
> Could be much simpler with std::sample.
Done
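
A rough sketch of the simplification, again not the actual updated code: SampleSubset is a hypothetical helper, and plain ints stand in for ColumnId values. Note that the quoted snippet fills col_indices with 0..k-1 and shuffles them, so it always projects the first k columns; sampling directly from the full list of column IDs instead yields a random k-element subset and removes the index-to-ID conversion loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <random>
#include <vector>

// Hypothetical helper (not part of Kudu): returns up to 'k' distinct
// elements of 'items', chosen uniformly at random.
template <typename T>
std::vector<T> SampleSubset(const std::vector<T>& items,
                            size_t k,
                            std::mt19937& gen) {
  std::vector<T> out;
  out.reserve(std::min(k, items.size()));
  // Copies min(k, items.size()) elements without repetition. With vector
  // iterators the relative order of the selected elements is preserved.
  std::sample(items.begin(), items.end(), std::back_inserter(out), k, gen);
  return out;
}
```

Under these assumptions, the projection step would reduce to building the vector of the schema's column IDs once, calling something like SampleSubset(ids, gen() % ids.size(), gen), and passing the result to CreateProjectionByIdsIgnoreMissing(). If the order of projected columns should also be randomized, a std::shuffle over the result would do that.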



--
To view, visit http://gerrit.cloudera.org:8080/23893
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic51e8fec02f74ecc11fa740f05ffeb9a7f41d8d9
Gerrit-Change-Number: 23893
Gerrit-PatchSet: 7
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Martonka <[email protected]>
Gerrit-Comment-Date: Fri, 06 Feb 2026 07:21:17 +0000
Gerrit-HasComments: Yes
