caiwanli opened a new issue, #41591:
URL: https://github.com/apache/arrow/issues/41591
### Describe the usage question you have. Please include as many useful details as possible.
I want to sort a RecordBatch, and I noticed the SortOptions class in
compute, so I wrote the following code:
`// Table [id, name, age, salary, height]
// Create a couple 32-bit integer arrays.
arrow::Int32Builder int32builder;
int32_t ids[10] = {0,1,2,3,4,5,6,7,8,9};
ARROW_RETURN_NOT_OK(int32builder.AppendValues(ids, 10));
std::shared_ptr<arrow::Array> id;
ARROW_ASSIGN_OR_RAISE(id, int32builder.Finish());
int32_t ages[10] = {28,34,25,26,24,23,25,23,30,22};
ARROW_RETURN_NOT_OK(int32builder.AppendValues(ages, 10));
std::shared_ptr<arrow::Array> age;
ARROW_ASSIGN_OR_RAISE(age, int32builder.Finish());
int32_t salarys[10] =
{20000,38000,23000,22000,5000,8000,9000,18000,45000,4000};
ARROW_RETURN_NOT_OK(int32builder.AppendValues(salarys, 10));
std::shared_ptr<arrow::Array> salary;
ARROW_ASSIGN_OR_RAISE(salary, int32builder.Finish());
int32_t heights[10] = {165,178,183,180,177,174,176,173,168,171};
ARROW_RETURN_NOT_OK(int32builder.AppendValues(heights, 10));
std::shared_ptr<arrow::Array> height;
ARROW_ASSIGN_OR_RAISE(height, int32builder.Finish());
arrow::StringBuilder utf8builder;
std::vector<std::string> names =
{"cwl_0","hhq_1","lsc_2","zyn_3","yyj_4","zyy_5","chj_6","hcf_7","wk_8","zw_9"};
ARROW_RETURN_NOT_OK(utf8builder.AppendValues(names));
std::shared_ptr<arrow::Array> name;
ARROW_ASSIGN_OR_RAISE(name, utf8builder.Finish());
// Make a record batch out of the five arrays.
// [id, name, age, salary, height]
std::shared_ptr<arrow::Field> field_id, field_name, field_age,
field_salary, field_height;
std::shared_ptr<arrow::Schema> schema;
field_id = arrow::field("ID", arrow::int32());
field_name = arrow::field("Name", arrow::utf8());
field_age = arrow::field("Age", arrow::int32());
field_salary = arrow::field("Salary", arrow::int32());
field_height = arrow::field("Height", arrow::int32());
schema = arrow::schema({field_id, field_name, field_age, field_salary,
field_height});
std::shared_ptr<arrow::RecordBatch> rbatch;
rbatch = arrow::RecordBatch::Make(schema, 10, {id, name, age, salary,
height});
std::cout << "===================Sort====================" << std::endl;
arrow::Datum res;
std::vector<arrow::compute::SortKey> sort_keys;
arrow::compute::SortKey age_key("Age",
arrow::compute::SortOrder::Ascending);
arrow::compute::SortKey salary_key("Salary",
arrow::compute::SortOrder::Ascending);
arrow::compute::SortKey name_key("Name",
arrow::compute::SortOrder::Descending);
sort_keys.emplace_back(age_key);
sort_keys.emplace_back(salary_key);
sort_keys.emplace_back(name_key);
// arrow::compute::FilterOptions filter_options;
// ARROW_ASSIGN_OR_RAISE(
// four_item, arrow::compute::CallFunction("filter", {rbatch1, filter_arr}, &filter_options));
arrow::compute::SortOptions sort_options(std::move(sort_keys));
ARROW_ASSIGN_OR_RAISE(res, arrow::compute::CallFunction("order",
{rbatch}, &sort_options));
rbatch = res.record_batch();
std::cout << "Datum kind: " << rbatch->ToString() << "++++" <<
rbatch->num_rows()<< std::endl;
return arrow::Status::OK();`
However, I'm not sure this usage is correct, and I don't know which
function name to pass to CallFunction. I couldn't find any sorting-related
instructions in the official documentation. How should I sort a
RecordBatch?
### Component(s)
C++