[ https://issues.apache.org/jira/browse/ARROW-18274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Laurent Querel updated ARROW-18274:
-----------------------------------
    Description: 
Union of structs is currently buggy in V10. See the following example.

{code:go}
dt1 := arrow.SparseUnionOf([]arrow.Field{
    {Name: "c", Type: &arrow.DictionaryType{
        IndexType: arrow.PrimitiveTypes.Uint16,
        ValueType: arrow.BinaryTypes.String,
        Ordered:   false,
    }},
}, []arrow.UnionTypeCode{0})
dt2 := arrow.SparseUnionOf([]arrow.Field{
    {Name: "a", Type: dt1},
}, []arrow.UnionTypeCode{0})
pool := memory.NewGoAllocator()
array := array.NewSparseUnionBuilder(pool, dt2)
{code}

The created array is unusable because the memo table of the dictionary builder (field 'c') is nil.
When I replace the struct by a second union (so two nested unions), the dictionary builder is properly initialized.

*First analysis:*
- The `NewSparseUnionBuilder` creates the builders for each variant and also calls defer builder.Release.
- The Struct Release method calls the Release methods of every field even if the refCount is not 0, so the Release method of the second union is called, followed by the Release method of the dictionary.

This bug doesn't happen with two nested unions, as the internal counter is properly checked there.

To begin with, I don't understand why the Release method of each variant is called right after the creation of the Union builder. I also don't understand why the Release method of the Struct calls the Release method of each field regardless of the value of the internal refCount.

Any ideas?

  was:
Union of structs is currently buggy in V10. See the following example.

{code:go}
dt1 := arrow.SparseUnionOf([]arrow.Field{
    {Name: "c", Type: &arrow.DictionaryType{
        IndexType: arrow.PrimitiveTypes.Uint16,
        ValueType: arrow.BinaryTypes.String,
        Ordered:   false,
    }},
}, []arrow.UnionTypeCode{0})
dt2 := arrow.SparseUnionOf([]arrow.Field{
    {Name: "a", Type: dt1},
}, []arrow.UnionTypeCode{0})
pool := memory.NewGoAllocator()
array := array.NewSparseUnionBuilder(pool, dt2)
{code}

The created array is unusable because the memo table of the dictionary builder (field 'c') is nil.
When I replace the struct by a second union (so two nested unions), the dictionary builder is properly initialized.

First analysis:
- The `NewSparseUnionBuilder` creates the builders for each variant and also calls defer builder.Release.
- The Struct Release method calls the Release methods of every field even if the internal counter is not 0, so the Release method of the second union is called, followed by the Release method of the dictionary.

This bug doesn't happen with two nested unions, as the internal counter is properly checked there.

To begin with, I don't understand why the Release method of each variant is called right after the creation of the Union builder. I also don't understand why the Release method of the Struct calls the Release method of each field regardless of the value of the internal counter.

Any ideas?


> [Go] Sparse union of structs is buggy
> -------------------------------------
>
>                 Key: ARROW-18274
>                 URL: https://issues.apache.org/jira/browse/ARROW-18274
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Go
>    Affects Versions: 10.0.0
>            Reporter: Laurent Querel
>            Priority: Major
>
> Union of structs is currently buggy in V10. See the following example.
>
> {code:go}
> dt1 := arrow.SparseUnionOf([]arrow.Field{
>     {Name: "c", Type: &arrow.DictionaryType{
>         IndexType: arrow.PrimitiveTypes.Uint16,
>         ValueType: arrow.BinaryTypes.String,
>         Ordered:   false,
>     }},
> }, []arrow.UnionTypeCode{0})
> dt2 := arrow.SparseUnionOf([]arrow.Field{
>     {Name: "a", Type: dt1},
> }, []arrow.UnionTypeCode{0})
> pool := memory.NewGoAllocator()
> array := array.NewSparseUnionBuilder(pool, dt2)
> {code}
>
> The created array is unusable because the memo table of the dictionary
> builder (field 'c') is nil.
> When I replace the struct by a second union (so two nested unions), the
> dictionary builder is properly initialized.
>
> *First analysis:*
> - The `NewSparseUnionBuilder` creates the builders for each variant and also
> calls defer builder.Release.
> - The Struct Release method calls the Release methods of every field even if
> the refCount is not 0, so the Release method of the second union is called,
> followed by the Release method of the dictionary.
>
> This bug doesn't happen with two nested unions, as the internal counter is
> properly checked there.
>
> To begin with, I don't understand why the Release method of each variant
> is called right after the creation of the Union builder. I also don't
> understand why the Release method of the Struct calls the Release method of
> each field regardless of the value of the internal refCount.
>
> Any ideas?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
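
The release ordering described in the analysis above can be illustrated with a small self-contained model. This is only a sketch under stated assumptions: `dictBuilder`, `parentBuilder`, the `guarded` flag and `memoAliveAfterConstruction` are hypothetical names invented for the illustration and are not Arrow types or functions. It models a parent builder that is retained by an owner and then released once by its constructor, and compares releasing children on every call with releasing them only when the parent's own count reaches zero.

```go
package main

import "fmt"

// dictBuilder is a hypothetical stand-in for a dictionary builder
// that owns a memo table. It is NOT the Arrow implementation.
type dictBuilder struct {
	refCount int64
	memo     map[string]int
}

func newDictBuilder() *dictBuilder {
	return &dictBuilder{refCount: 1, memo: map[string]int{}}
}

// Release drops the memo table only when the last reference goes away.
func (d *dictBuilder) Release() {
	d.refCount--
	if d.refCount == 0 {
		d.memo = nil
	}
}

// parentBuilder is a hypothetical nested builder holding one child.
// When guarded is false, the child is released on every Release call,
// regardless of the parent's own reference count.
type parentBuilder struct {
	refCount int64
	child    *dictBuilder
	guarded  bool
}

func (p *parentBuilder) Retain() { p.refCount++ }

func (p *parentBuilder) Release() {
	p.refCount--
	if p.guarded && p.refCount > 0 {
		return // still referenced: leave the child alone
	}
	p.child.Release()
}

// memoAliveAfterConstruction models the construction sequence: an owner
// retains the parent, then the constructor drops its temporary reference.
// It reports whether the child's memo table survived.
func memoAliveAfterConstruction(guarded bool) bool {
	child := newDictBuilder()
	parent := &parentBuilder{refCount: 1, child: child, guarded: guarded}
	parent.Retain()  // the enclosing builder takes its own reference
	parent.Release() // the constructor's deferred Release
	return child.memo != nil
}

func main() {
	fmt.Println("unguarded release keeps memo:", memoAliveAfterConstruction(false))
	fmt.Println("guarded release keeps memo:  ", memoAliveAfterConstruction(true))
}
```

In this model the unguarded variant leaves the parent alive with a reference count of 1 but a child whose memo table is already nil, which matches the symptom reported above. The guarded variant keeps the child intact until the parent itself is fully released.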