freakyzoidberg opened a new issue, #37841:
URL: https://github.com/apache/arrow/issues/37841
### Describe the bug, including details regarding any error messages,
version, and platform.
I am trying to decode in Java records generated in Go (simple type +
dictionaries) using ZSTD compression
Although this is working fine for the simple types, I am getting this error
when decoding dictionaries
```
java.lang.IllegalArgumentException: Please add arrow-compression module to
use CommonsCompressionFactory for ZSTD
at
org.apache.arrow.vector.compression.NoCompressionCodec$Factory.createCodec(NoCompressionCodec.java:69)
at org.apache.arrow.vector.VectorLoader.load(VectorLoader.java:82)
at org.apache.arrow.vector.ipc.ArrowReader.load(ArrowReader.java:256)
at
org.apache.arrow.vector.ipc.ArrowReader.loadDictionary(ArrowReader.java:247)
at
org.apache.arrow.vector.ipc.ArrowStreamReader.loadNextBatch(ArrowStreamReader.java:167)
```
The Go part is essentially
```go
dtyp := &arrow.DictionaryType{
IndexType: arrow.PrimitiveTypes.Int8,
ValueType: arrow.BinaryTypes.LargeString,
}
bldrDictString := arrowarray.NewDictionaryBuilder(memory.DefaultAllocator,
dtyp)
defer bldrDictString.Release()
bldrDictString.(*arrowarray.BinaryDictionaryBuilder).AppendString("foo")
columnTypes := make([]arrow.Field, 0, 1)
columnArrays := make([]arrow.Array, 0, 1)
columnArrays = append(columnArrays, bldrDictString.NewArray())
columnTypes = append(columnTypes, arrow.Field{Name: k.key, Type: dtyp,
Nullable: nulls.Any()})
schema := arrow.NewSchema(columnTypes, nil)
rec := arrowarray.NewRecord(schema, columnArrays, int64(size))
var buf bytes.Buffer
writer := ipc.NewWriter(&buf, ipc.WithSchema(schema), ipc.WithZstd())
err := writer.Write(rec)
err = writer.Close()
```
And the Java side
```java
import org.apache.arrow.compression.CommonsCompressionFactory;
try (ArrowStreamReader reader =
new ArrowStreamReader(
new ByteArrayInputStream(format.getArrow().toByteArray()),
bufferAllocator,
CommonsCompressionFactory.INSTANCE)) {
reader.loadNextBatch();
...
} catch (IOException e) {
throw new RuntimeException(e);
}
```
I am able to get it to not throw by making the VectorLoader used when
loading the dictionary use the compression factory defined in the reader (it is
currently defaulting to NoCompression)
see this
[change](https://github.com/freakyzoidberg/arrow/commit/f945d2ddee9c332661d3d97084c2aedb56f7fcf5),
note I was not able to make it fail using the java arrow test.
I am probably doing something wrong, and also wondering if dictionaries are
compressed the same in go and java writers which could explain why the java
test is not failing ?
Anyhow, unless I am doing something wrong, this looks like a bug.
Thanks !
### Component(s)
Java
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]