Sutou Kouhei created ARROW-6196:
---
Summary: [Ruby] Add support for building Arrow::TimeNNArray by .new
Key: ARROW-6196
URL: https://issues.apache.org/jira/browse/ARROW-6196
Project: Apache Arrow
Hi Jacques,
What avenue were you thinking for supporting both paths? I didn't want to
pursue a different class hierarchy, because I felt like that would
effectively fork the code base, but that is potentially an option that
would allow us to have a complete reference implementation in Java that
Reading data from two different parquet files sequentially with different
dictionaries for the same column. This could be handled by re-encoding
data but that seems potentially sub-optimal.
On Sat, Aug 10, 2019 at 12:38 PM Jacques Nadeau wrote:
> What situation are you anticipating where you're
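
For illustration only, the re-encoding option mentioned in the message above (two files carrying different dictionaries for the same column) might look roughly like the sketch below. It uses plain Python lists, not the Arrow or Parquet APIs; the function names `unify_dictionaries` and `reencode` and the sample values are hypothetical, and the first dictionary is assumed to hold unique values.

```python
def unify_dictionaries(dict_a, dict_b):
    """Merge dict_b into dict_a.

    Returns (unified, remap_b), where remap_b[i] is the position in
    `unified` of the value stored at dict_b[i].
    Assumes dict_a contains no duplicate values.
    """
    unified = list(dict_a)
    position = {value: i for i, value in enumerate(unified)}
    remap_b = []
    for value in dict_b:
        if value not in position:
            # Value not seen before: append it to the unified dictionary.
            position[value] = len(unified)
            unified.append(value)
        remap_b.append(position[value])
    return unified, remap_b


def reencode(indices, remap):
    """Rewrite every index so that it points into the unified dictionary."""
    return [remap[i] for i in indices]


# Hypothetical data: the same column read from two files.
file1_dict = ["apple", "banana"]
file2_dict = ["banana", "cherry"]
file2_indices = [0, 1, 1, 0]  # banana, cherry, cherry, banana

unified, remap = unify_dictionaries(file1_dict, file2_dict)
file2_reencoded = reencode(file2_indices, remap)
```

The cost the message alludes to is visible here: every index from the second file has to be rewritten, which is the work that restating the dictionary mid-stream would avoid.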
Ji Liu created ARROW-6194:
---
Summary: [Java] Make DictionaryEncoder non-static making it easy
to extend and reuse
Key: ARROW-6194
URL: https://issues.apache.org/jira/browse/ARROW-6194
Project: Apache Arrow
What situation are you anticipating where you're going to be restating ids mid
stream?
On Sat, Aug 10, 2019 at 12:13 AM Micah Kornfield
wrote:
> The IPC specification [1] defines behavior when isDelta on a
> DictionaryBatch [2] is "true". I might have missed it in the
> specification, but I
Sutou Kouhei created ARROW-6197:
---
Summary: [GLib] Add garrow_decimal128_rescale()
Key: ARROW-6197
URL: https://issues.apache.org/jira/browse/ARROW-6197
Project: Apache Arrow
Issue Type: New
I should add that Option #1 above would be my preference, even though it
adds some complications (especially for the file format).
On Sat, Aug 10, 2019 at 12:12 AM Micah Kornfield
wrote:
> The IPC specification [1] defines behavior when isDelta on a
> DictionaryBatch [2] is "true". I might
Omer Ozarslan created ARROW-6195:
Summary: [C++] CMake fails with file not found error while
bundling thrift if python is not installed
Key: ARROW-6195
URL: https://issues.apache.org/jira/browse/ARROW-6195
This is a pretty massive change to the apis. I wonder how nasty it would be
to just support both paths. Have you evaluated how complex that would be?
On Wed, Aug 7, 2019 at 11:08 PM Micah Kornfield
wrote:
> After more investigation, it looks like Float8Benchmarks at least on my
> machine are
The IPC specification [1] defines behavior when isDelta on a
DictionaryBatch [2] is "true". I might have missed it in the
specification, but I couldn't find the interpretation for what the expected
behavior is when isDelta=false and two dictionary batches with the
same ID are sent.
It
Hi, Jacques, thanks for your valuable feedback.
Sorry for the lack of discussion. Some of these PRs are small changes or bug fixes
that did not seem to warrant one. You are right that some PRs turned out to be more
complex in review than we expected, and discussing them on the ML/JIRA first would
actually help. This
Hey Micah, I didn't have a particular path in mind. Was thinking more along
the lines of extra methods as opposed to separate classes.
Arrow hasn't historically been a place where we're writing algorithms in
Java so the fact that they aren't there doesn't mean they don't exist. We
have a large
Hi, all
While working on the issue to implement dictionary-encoded subfields [1] [2], I
found that FixedSizeListVector does not extend ListVector (thanks to Micah for
pointing this out), and I am curious why FixedSizeListVector was implemented this
way. Since FixedSizeListVector is a specific case of ListVector,
Hi Ji Liu,
I think having a common interface/base class for the two makes sense from a
data-reading perspective (though I don't have the historical context).
I think the change would need to be something above
BaseRepeatedValueVector, since the FixedSizeListVector doesn't contain an
offset buffer, and that
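
As a rough illustration of the point about the offset buffer, the sketch below contrasts the two layouts using plain Python lists rather than the Arrow Java classes. The function names and sample data are hypothetical. A variable-size list looks up its element boundaries in an offsets array, while a fixed-size list computes them from the index and the list size, so it needs no offsets at all.

```python
def variable_list_slice(values, offsets, i):
    """Element i of a variable-size list: boundaries come from the offsets."""
    return values[offsets[i]:offsets[i + 1]]


def fixed_size_list_slice(values, list_size, i):
    """Element i of a fixed-size list: boundaries are computed arithmetically."""
    start = i * list_size
    return values[start:start + list_size]


# Hypothetical child values shared by both layouts.
values = [10, 11, 12, 13, 14, 15]

# Variable-size layout: lists of length 1, 3 and 2.
offsets = [0, 1, 4, 6]
second_variable = variable_list_slice(values, offsets, 1)

# Fixed-size layout: three lists of length 2, no offsets stored.
second_fixed = fixed_size_list_slice(values, 2, 1)
```

This is why a shared base type for the two would likely have to sit above whatever class owns the offsets, as the message suggests.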
Hi Jacques,
I definitely understand these concerns and this change is risky because it
is so large. Perhaps creating a new hierarchy might be the cleanest way
of dealing with this. This could have other benefits, like cleaning up some
cruft around dictionary encoding and "orphaned" methods. Per
Hi Micah, thanks for your suggestion.
You are right, the main difference between FixedSizeListVector and ListVector
is the offsetBuffer, but I think this could be avoided by overriding
allocateNewSafe(), which calls allocateOffsetBuffer() in
BaseRepeatedValueVector; in this way,