ewjoachim opened a new issue, #39309:
URL: https://github.com/apache/arrow/issues/39309
### Describe the bug, including details regarding any error messages, version, and platform.
It's been quite hard to understand how to write code that uses Arrow in Go,
but I believe I've managed to get it working. One thing that still stumps me
is the following scenario. I did my best to make it minimal, so it looks a bit
strange, but unless you think I'm doing something that should be "forbidden"
altogether, please consider it the short form of my use case.
```go
package main

import (
	"os"

	"github.com/apache/arrow/go/v14/arrow"
	"github.com/apache/arrow/go/v14/arrow/array"
	"github.com/apache/arrow/go/v14/arrow/memory"
	"github.com/apache/arrow/go/v14/parquet"
	"github.com/apache/arrow/go/v14/parquet/pqarrow"
)

func main() {
	// Schema with a single column: a list of uint64 values.
	schema := arrow.NewSchema(
		[]arrow.Field{
			{Name: "ts", Type: arrow.ListOf(arrow.PrimitiveTypes.Uint64)},
		}, nil)

	// Build a record with one row: an empty list.
	builder := array.NewRecordBuilder(memory.DefaultAllocator, schema)
	listBuilder := builder.Field(0).(*array.ListBuilder)
	listBuilder.Append(true)
	arrowRec := builder.NewRecord()

	f, err := os.CreateTemp(".", "test.parquet")
	if err != nil {
		panic(err)
	}

	// Disable dictionary encoding and request DeltaBinaryPacked for the list elements.
	fileWriter, err := pqarrow.NewFileWriter(
		arrowRec.Schema(),
		f,
		parquet.NewWriterProperties(
			parquet.WithDictionaryFor("ts.list.element", false),
			parquet.WithEncodingFor("ts.list.element", parquet.Encodings.DeltaBinaryPacked),
		),
		pqarrow.DefaultWriterProps(),
	)
	if err != nil {
		panic(err)
	}
	parquetWriter := fileWriter

	// The nil pointer panic is raised inside this call.
	if err := parquetWriter.WriteBuffered(arrowRec); err != nil {
		panic(err)
	}
	arrowRec.Release()
}
```
In plain words: I'm creating an Arrow schema containing a list of uints.
Then I write a single empty list. I tell Parquet to encode the uints as
`DeltaBinaryPacked`. Finally, I write the Parquet file.
Without the encoding option, this works, but when I specify
`DeltaBinaryPacked`, I get a panic.
The failure is in `BitWriter.Written`, which is called on a nil `BitWriter`
pointer.
```
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xc6a3fc]
goroutine 1 [running]:
github.com/apache/arrow/go/v14/parquet/internal/utils.(*BitWriter).Written(0x0)
	/home/joachim/go/pkg/mod/github.com/apache/arrow/go/[email protected]/parquet/internal/utils/bit_writer.go:109 +0x1c
github.com/apache/arrow/go/v14/parquet/internal/encoding.(*deltaBitPackEncoder).EstimatedDataEncodedSize(0xc0002be900)
	/home/joachim/go/pkg/mod/github.com/apache/arrow/go/[email protected]/parquet/internal/encoding/delta_bit_packing.go:461 +0x2c
github.com/apache/arrow/go/v14/parquet/file.(*columnWriter).commitWriteAndCheckPageLimit(0xc0003d8000, 0x1, 0x0)
	/home/joachim/go/pkg/mod/github.com/apache/arrow/go/[email protected]/parquet/file/column_writer.go:255 +0x86
github.com/apache/arrow/go/v14/parquet/file.(*Int64ColumnChunkWriter).WriteBatchSpaced.func1(0x0, 0x1)
	/home/joachim/go/pkg/mod/github.com/apache/arrow/go/[email protected]/parquet/file/column_writer_types.gen.go:350 +0x412
github.com/apache/arrow/go/v14/parquet/file.doBatches(0x1, 0x400, 0xc0002d00b0)
	/home/joachim/go/pkg/mod/github.com/apache/arrow/go/[email protected]/parquet/file/column_writer.go:631 +0xf5
github.com/apache/arrow/go/v14/parquet/file.(*Int64ColumnChunkWriter).WriteBatchSpaced(0xc0003d8000, {0x0, 0x0, 0x0}, {0xc00036c3c0, 0x1, 0x80}, {0xc00036c500, 0x1, 0x80}, ...)
	/home/joachim/go/pkg/mod/github.com/apache/arrow/go/[email protected]/parquet/file/column_writer_types.gen.go:336 +0x247
github.com/apache/arrow/go/v14/parquet/pqarrow.writeDenseArrow(0xc000346730, {0x17c89b0, 0xc0003d8000}, {0x17c57a0, 0xc00033ba80}, {0xc00036c3c0, 0x1, 0x80}, {0xc00036c500, 0x1, ...}, ...)
	/home/joachim/go/pkg/mod/github.com/apache/arrow/go/[email protected]/parquet/pqarrow/encode_arrow.go:436 +0x57a9
github.com/apache/arrow/go/v14/parquet/pqarrow.WriteArrowToColumn({0x17b4180, 0xc00037a870}, {0x17c89b0, 0xc0003d8000}, {0x17c57a0, 0xc00033ba80}, {0xc00036c3c0, 0x1, 0x80}, {0xc00036c500, ...}, ...)
	/home/joachim/go/pkg/mod/github.com/apache/arrow/go/[email protected]/parquet/pqarrow/encode_arrow.go:232 +0x3cf
github.com/apache/arrow/go/v14/parquet/pqarrow.(*ArrowColumnWriter).Write(0xc0002d1860, {0x17b4180, 0xc00037a870})
	/home/joachim/go/pkg/mod/github.com/apache/arrow/go/[email protected]/parquet/pqarrow/encode_arrow.go:193 +0x832
github.com/apache/arrow/go/v14/parquet/pqarrow.(*FileWriter).WriteColumnChunked(0xc0003d4000, 0xc00033ba00, 0x0, 0x1)
	/home/joachim/go/pkg/mod/github.com/apache/arrow/go/[email protected]/parquet/pqarrow/file_writer.go:313 +0x27a
github.com/apache/arrow/go/v14/parquet/pqarrow.(*FileWriter).WriteColumnData(0xc0003d4000, {0x17c5aa0, 0xc0003466e0})
	/home/joachim/go/pkg/mod/github.com/apache/arrow/go/[email protected]/parquet/pqarrow/file_writer.go:322 +0x17d
github.com/apache/arrow/go/v14/parquet/pqarrow.(*FileWriter).WriteBuffered(0xc0003d4000, {0x17c3ae8, 0xc00037a4e0})
	/home/joachim/go/pkg/mod/github.com/apache/arrow/go/[email protected]/parquet/pqarrow/file_writer.go:186 +0x83b
main.main()
	/home/joachim/src/goplayground/play/main.go:66 +0x472
```
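To illustrate the failure mode the trace points at, here is a minimal sketch using hypothetical stand-in types (`bitWriter`, `lazyEncoder`). It is not the library's code. It assumes the encoder only allocates its bit writer once it receives values, which is a guess based on the trace and not something verified in the Arrow source.

```go
package main

import "fmt"

// bitWriter is a hypothetical stand-in for a bit-level writer.
// It is NOT the library's BitWriter.
type bitWriter struct {
	buf []byte
}

// Written reads a field through the receiver, so it panics on a nil receiver.
func (w *bitWriter) Written() int { return len(w.buf) }

// lazyEncoder is a hypothetical encoder that allocates its writer only when
// the first values arrive (an assumption made for this illustration).
type lazyEncoder struct {
	w *bitWriter
}

// Put stores one byte per value; with no values the writer stays nil.
func (e *lazyEncoder) Put(vals []int64) {
	if len(vals) == 0 {
		return
	}
	if e.w == nil {
		e.w = &bitWriter{}
	}
	for _, v := range vals {
		e.w.buf = append(e.w.buf, byte(v))
	}
}

// UnsafeEstimate dereferences the writer unconditionally.
func (e *lazyEncoder) UnsafeEstimate() int { return e.w.Written() }

// SafeEstimate treats a missing writer as zero bytes written.
func (e *lazyEncoder) SafeEstimate() int {
	if e.w == nil {
		return 0
	}
	return e.w.Written()
}

// panics reports whether calling f panics.
func panics(f func()) (p bool) {
	defer func() {
		if recover() != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	enc := &lazyEncoder{}
	enc.Put(nil) // an empty list contributes zero values
	fmt.Println("unsafe estimate panics:", panics(func() { enc.UnsafeEstimate() }))
	fmt.Println("safe estimate:", enc.SafeEstimate())
}
```

If the real encoder behaves like this model, a nil check in the size estimate (or eager allocation of the writer) would avoid the panic for columns that receive no values.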
Environment: Arrow Go v14. I tried to run against master, but I'm new enough
to Go that I'm not sure whether it's really master and not the latest stable
v14 tag. `go version go1.21.4 linux/amd64`, Ubuntu 22.04 LTS.
### Component(s)
Go, Parquet