palday commented on code in PR #582:
URL: https://github.com/apache/arrow-julia/pull/582#discussion_r2665726882
##########
src/arraytypes/dictencoding.jl:
##########
@@ -142,55 +142,58 @@ function arrowvector(
kw...,
)
id = x.encoding.id
+ # XXX This is a race condition if two workers hit this block at the same time, then they'll create
+ # distinct locks
if !haskey(de, id)
de[id] = Lockable(x.encoding)
- else
- encodinglockable = de[id]
- Base.@lock encodinglockable begin
- encoding = encodinglockable.value
- # in this case, we just need to check if any values in our local pool need to be delta dicationary serialized
- deltas = setdiff(x.encoding, encoding)
- if !isempty(deltas)
- ET = indextype(encoding)
- if length(deltas) + length(encoding) > typemax(ET)
- error(
- "fatal error serializing dict encoded column with ref index type of $ET; subsequent record batch unique values resulted in $(length(deltas) + length(encoding)) unique values, which exceeds possible index values in $ET",
- )
- end
- data = arrowvector(
- deltas,
- i,
- nl,
- fi,
- de,
- ded,
- nothing;
- dictencode=dictencodenested,
- dictencodenested=dictencodenested,
- dictencoding=true,
- kw...,
+ return x
Review Comment:
Changed this to an early return to make the overall logic a little clearer
and to reduce the nesting of the former `else` branch.
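
For the race flagged in the `XXX` comment, one possible direction is to do the lookup and the insertion under a single lock, so that two workers can never create distinct locks for the same id. This is only a minimal sketch under stated assumptions: a plain `Dict` and one `ReentrantLock` stand in for the real dictionary-encoding state, and the names `REGISTRY_LOCK` and `get_or_create!` are hypothetical, not part of Arrow.jl.

```julia
# Hypothetical stand-in for the shared state; not the package's actual types.
const REGISTRY_LOCK = ReentrantLock()

# Return the value stored under `id`, creating it with `make()` if absent.
# The check and the insertion happen while holding one lock, so concurrent
# callers asking for the same `id` all receive the same stored object.
function get_or_create!(make, registry::Dict{Int64,Any}, id::Integer)
    Base.@lock REGISTRY_LOCK begin
        return get!(make, registry, Int64(id))
    end
end

registry = Dict{Int64,Any}()
a = get_or_create!(() -> ReentrantLock(), registry, 1)
b = get_or_create!(() -> ReentrantLock(), registry, 1)
@assert a === b  # the second call reuses the lock created by the first
```

The trade-off is that every lookup now takes the outer lock, which may or may not be acceptable here.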