cyb70289 commented on a change in pull request #11864: URL: https://github.com/apache/arrow/pull/11864#discussion_r766338813
########## File path: cpp/src/arrow/compute/kernels/random.cc ########## @@ -0,0 +1,63 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +#include <memory> +#include <random> + +#include "arrow/compute/kernel.h" +#include "arrow/compute/kernels/common.h" +#include "arrow/compute/registry.h" +#include "arrow/util/pcg_random.h" + +namespace arrow { +namespace compute { +namespace internal { +namespace { + +// Generates a random floating point number in range [0, 1). +template <class Rng> +double generate_uniform(Rng& rng) { + double range = static_cast<double>(Rng::max() - Rng::min()) + 1; + double value = static_cast<double>(rng() - Rng::min()); + return value / range; +} + +Status ExecRandom(KernelContext* ctx, const ExecBatch& batch, Datum* out) { + static thread_local std::random_device rd; + static thread_local random::pcg64 gen(rd()); Review comment: Options are provided once for each call. There's no state transition. One possible implementation is to provide two options. One option (bool) to select using random seed or preset seed, and another option (int) for the preset seed. If the first option is true(random seed), we will use std::random_device to generate the initial seed, and ignore the second option. If the first options is false(preset seed), then we will use the second option as the initial seed. @pitrou please correct me if I'm wrong. Looks current implementation generates only one scalar per call, so a seed may be a bit awkward to use. You have to choose a different seed for each call to get a different output. The common use case is to generate a random array, it's desirable to provide a way to get same output for all calls on all platforms if we provide the same initial seed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
