This is an automated email from the ASF dual-hosted git repository.
zeroshade pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/main by this push:
new fc48b8963d GH-40261: [Go] Don't export array functions with unexposed
return types (#40272)
fc48b8963d is described below
commit fc48b8963d6486ac129a7c1365a35d02b28876e8
Author: Gabriel Tomitsuka <[email protected]>
AuthorDate: Thu Feb 29 19:00:33 2024 +0100
GH-40261: [Go] Don't export array functions with unexposed return types
(#40272)
### Rationale for this change
Exposing functions that return unexposed types in Go is considered poor
practice. This approach complicates type handling, making it challenging for
developers to utilize these return values in their functions. Developers must
undertake the cumbersome process of identifying the applicable interface for
the return type, a task that often results in significant time consumption and
leads to confusing, non-informative types being suggested by godocs and IDEs.
Consider the difficulty in discerning the relationship between two return
types, `*simpleTable` and `arrow.Table`, at a glance. It is not immediately
clear whether they implement the same interface or are distinct entities:
<img width="1161" alt="image"
src="https://github.com/apache/arrow/assets/10295671/463cd8a7-47f3-44ce-9871-2885025e5a5c">
<img width="1151" alt="image"
src="https://github.com/apache/arrow/assets/10295671/4ffc049c-fb88-43fb-bd57-fc1ad5d4dc68">
Returning exposed interfaces is already commonly done in the Arrow package
to ensure API consistency and usability, as evidenced in methods like
`AddColumn() -> arrow.Table` and `RecordFromJSON() -> arrow.Record`. Extending
this to all functions, including `NewTable`, `NewTableFromSlice`, and
`NewRecord`, will standardize the codebase in line with these principles.
The use of `*simpleTable` and similar types is restricted in explicit type
declarations and function signatures. Therefore, transitioning to exposed
return types is a backward-compatible improvement that will lead to enhanced
documentation and better support in IDEs for Arrow users.
### What changes are included in this PR?
* Change return signature of functions using the following unexposed return
types:
* `*simpleTable` --> `arrow.Table`
* `*simpleRecord` --> `arrow.Record`
* `*simpleRecords` --> `array.RecordReader`
* Add the function `String()`, which is implemented by `*simpleTable`, to
the `arrow.Table` interface. `*simpleTable` is the only implementation of
`arrow.Table`, so this requires no further changes.
### Are these changes tested?
Yes. The relevant code is already covered by tests in
`arrow/array/table_test.go` (`TestTable`) and `arrow/array/record_test.go`
(`TestRecord`, `TestRecordReader`).
All tests pass (subpackages without tests omitted):
```bash
ok github.com/apache/arrow/go/v16/arrow 0.398s
ok github.com/apache/arrow/go/v16/arrow/array 0.600s
ok github.com/apache/arrow/go/v16/arrow/arrio 1.544s
ok github.com/apache/arrow/go/v16/arrow/avro 0.629s
ok github.com/apache/arrow/go/v16/arrow/bitutil 1.001s
ok github.com/apache/arrow/go/v16/arrow/compute 2.147s
ok github.com/apache/arrow/go/v16/arrow/compute/exec 0.813s
ok github.com/apache/arrow/go/v16/arrow/compute/exprs 1.900s
ok github.com/apache/arrow/go/v16/arrow/csv 0.288s
ok github.com/apache/arrow/go/v16/arrow/decimal128 1.356s
ok github.com/apache/arrow/go/v16/arrow/decimal256 1.718s
ok github.com/apache/arrow/go/v16/arrow/encoded 0.493s
ok github.com/apache/arrow/go/v16/arrow/flight 2.845s
ok github.com/apache/arrow/go/v16/arrow/flight/flightsql 0.512s
ok github.com/apache/arrow/go/v16/arrow/flight/flightsql/driver
7.386s
ok github.com/apache/arrow/go/v16/arrow/float16 0.570s
ok github.com/apache/arrow/go/v16/arrow/internal/arrjson 0.419s
ok github.com/apache/arrow/go/v16/arrow/internal/dictutils 0.407s
ok github.com/apache/arrow/go/v16/arrow/internal/testing/tools
0.247s
ok github.com/apache/arrow/go/v16/arrow/ipc 1.984s
ok github.com/apache/arrow/go/v16/arrow/ipc/cmd/arrow-cat 0.530s
ok github.com/apache/arrow/go/v16/arrow/ipc/cmd/arrow-file-to-stream
1.267s
ok
github.com/apache/arrow/go/v16/arrow/ipc/cmd/arrow-json-integration-test
1.074s
ok github.com/apache/arrow/go/v16/arrow/ipc/cmd/arrow-ls 1.263s
ok github.com/apache/arrow/go/v16/arrow/ipc/cmd/arrow-stream-to-file
0.935s
ok github.com/apache/arrow/go/v16/arrow/math 0.616s
ok github.com/apache/arrow/go/v16/arrow/memory 1.275s
ok github.com/apache/arrow/go/v16/arrow/memory/mallocator 0.348s
ok github.com/apache/arrow/go/v16/arrow/scalar 0.484s
ok github.com/apache/arrow/go/v16/arrow/tensor 0.418s
ok github.com/apache/arrow/go/v16/arrow/util 0.621s
```
### Are there any user-facing changes?
This update will not affect users directly since the modified interfaces
were previously unexposed and, consequently, inaccessible from user code. This
change is aimed at improving the developer experience by simplifying type usage
and enhancing documentation clarity.
* GitHub Issue: #40261
Authored-by: gtomitsuka <[email protected]>
Signed-off-by: Matt Topol <[email protected]>
---
go/arrow/array/record.go | 4 ++--
go/arrow/array/table.go | 6 +++---
go/arrow/table.go | 2 ++
3 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/go/arrow/array/record.go b/go/arrow/array/record.go
index b4a03410c4..6a45880181 100644
--- a/go/arrow/array/record.go
+++ b/go/arrow/array/record.go
@@ -50,7 +50,7 @@ type simpleRecords struct {
}
// NewRecordReader returns a simple iterator over the given slice of records.
-func NewRecordReader(schema *arrow.Schema, recs []arrow.Record)
(*simpleRecords, error) {
+func NewRecordReader(schema *arrow.Schema, recs []arrow.Record) (RecordReader,
error) {
rs := &simpleRecords{
refCount: 1,
schema: schema,
@@ -124,7 +124,7 @@ type simpleRecord struct {
//
// NewRecord panics if the columns and schema are inconsistent.
// NewRecord panics if rows is larger than the height of the columns.
-func NewRecord(schema *arrow.Schema, cols []arrow.Array, nrows int64)
*simpleRecord {
+func NewRecord(schema *arrow.Schema, cols []arrow.Array, nrows int64)
arrow.Record {
rec := &simpleRecord{
refCount: 1,
schema: schema,
diff --git a/go/arrow/array/table.go b/go/arrow/array/table.go
index 197179b5ca..2e7bb72d77 100644
--- a/go/arrow/array/table.go
+++ b/go/arrow/array/table.go
@@ -99,7 +99,7 @@ type simpleTable struct {
//
// NewTable panics if the columns and schema are inconsistent.
// NewTable panics if rows is larger than the height of the columns.
-func NewTable(schema *arrow.Schema, cols []arrow.Column, rows int64)
*simpleTable {
+func NewTable(schema *arrow.Schema, cols []arrow.Column, rows int64)
arrow.Table {
tbl := simpleTable{
refCount: 1,
rows: rows,
@@ -136,7 +136,7 @@ func NewTable(schema *arrow.Schema, cols []arrow.Column,
rows int64) *simpleTabl
// - len(schema.Fields) != len(data)
// - the total length of each column's array slice (ie: number of rows
// in the column) aren't the same for all columns.
-func NewTableFromSlice(schema *arrow.Schema, data [][]arrow.Array)
*simpleTable {
+func NewTableFromSlice(schema *arrow.Schema, data [][]arrow.Array) arrow.Table
{
if len(data) != schema.NumFields() {
panic("array/table: mismatch in number of columns and data for
creating a table")
}
@@ -175,7 +175,7 @@ func NewTableFromSlice(schema *arrow.Schema, data
[][]arrow.Array) *simpleTable
// NewTableFromRecords returns a new basic, non-lazy in-memory table.
//
// NewTableFromRecords panics if the records and schema are inconsistent.
-func NewTableFromRecords(schema *arrow.Schema, recs []arrow.Record)
*simpleTable {
+func NewTableFromRecords(schema *arrow.Schema, recs []arrow.Record)
arrow.Table {
arrs := make([]arrow.Array, len(recs))
cols := make([]arrow.Column, schema.NumFields())
diff --git a/go/arrow/table.go b/go/arrow/table.go
index f0728108d9..15fd3e5bcf 100644
--- a/go/arrow/table.go
+++ b/go/arrow/table.go
@@ -39,6 +39,8 @@ type Table interface {
Retain()
Release()
+
+ fmt.Stringer
}
// Column is an immutable column data structure consisting of