Hi Micah,
Thanks for the comments.
By storing the run-length ends (the partial sums of the run-lengths), it provides
better support for random access (O(log(n))), at the expense of a larger
buffer width.
Generally, I think this is a better design, so the design should be changed
as follows:
2. the data stru
> The discussion in ARROW-6206 contains some mildly offensive language
> directed at the Arrow community, like "arrow is a team that picked up
> netty derived off-heap tools naively". Excuse me?
I'm trying my best to ignore language that isn't really productive to
solving technical problems :) If
Hi Ji Liu,
Thanks for getting the conversation started. I think a few things need to
happen:
1. We need to clarify in the specification that not all dictionaries need
to be present at the beginning. I plan on creating a PR for discussion
that clarifies this point, as well as handling of non-delt
Kenta Murata created ARROW-6319:
---
Summary: [C++] Extract the core of NumericTensor::Value as
Tensor::Value
Key: ARROW-6319
URL: https://issues.apache.org/jira/browse/ARROW-6319
Project: Apache Arrow
Hi Liya Fan,
Perhaps comment on the original thread? This differs from my proposal in
terms of the details of the encoding. For RLE, I proposed encoding run end indices
instead of run-lengths. This allows for sublinear access to elements at
the cost of potentially larger bit-widths for the lengths.
Th
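A minimal sketch of the trade-off described above, assuming a plain Python list stands in for each buffer. The names `run_ends_from_lengths` and `value_at` are hypothetical and not part of any Arrow API. Storing run ends (the running sum of run-lengths) keeps the index buffer sorted, so a lookup can binary search it instead of scanning the lengths linearly:

```python
from bisect import bisect_right


def run_ends_from_lengths(run_lengths):
    """Convert run-lengths into run ends (exclusive end index of each run)."""
    ends = []
    total = 0
    for length in run_lengths:
        total += length
        ends.append(total)
    return ends


def value_at(values, run_ends, index):
    """Return the logical element at `index` using binary search over run ends."""
    logical_length = run_ends[-1] if run_ends else 0
    if not 0 <= index < logical_length:
        raise IndexError(index)
    # The first run whose end is strictly greater than `index` contains it.
    return values[bisect_right(run_ends, index)]


values = ["a", "b", "c"]
run_lengths = [3, 1, 4]
run_ends = run_ends_from_lengths(run_lengths)
decoded = [value_at(values, run_ends, i) for i in range(run_ends[-1])]
```

The cost mentioned in the message shows up in the stored numbers: the largest run-length here is 4, while the largest run end is 8, so run ends may need a wider integer type than the lengths would.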
Micah Kornfield created ARROW-6318:
--
Summary: [Integration] Update integration test to use generated
binaries to ensure backwards compatibility
Key: ARROW-6318
URL: https://issues.apache.org/jira/browse/ARROW-631
Micah Kornfield created ARROW-6317:
--
Summary: [Javascript]
Key: ARROW-6317
URL: https://issues.apache.org/jira/browse/ARROW-6317
Project: Apache Arrow
Issue Type: Sub-task
Componen
Micah Kornfield created ARROW-6316:
--
Summary: [Go] Make change to ensure flatbuffer reads are aligned
Key: ARROW-6316
URL: https://issues.apache.org/jira/browse/ARROW-6316
Project: Apache Arrow
Micah Kornfield created ARROW-6315:
--
Summary: [Java] Make change to ensure flatbuffer reads are aligned
Key: ARROW-6315
URL: https://issues.apache.org/jira/browse/ARROW-6315
Project: Apache Arrow
Micah Kornfield created ARROW-6314:
--
Summary: [C++] Implement alignment to ensure flatbuffer alignment.
Key: ARROW-6314
URL: https://issues.apache.org/jira/browse/ARROW-6314
Project: Apache Arrow
Micah Kornfield created ARROW-6313:
--
Summary: Tracking
Key: ARROW-6313
URL: https://issues.apache.org/jira/browse/ARROW-6313
Project: Apache Arrow
Issue Type: Improvement
Reporte
Hi Wes,
Thanks for the good suggestion.
It is intended to be sent through IPC. So it should implement FieldVector,
not just ValueVector.
This can be considered a sub-item of Micah's proposal about
compression/decompression.
I will spend more time on that discussion.
Best,
Liya Fan
On Wed, Aug 2
Michael Maguire created ARROW-6312:
--
Summary: Declare required Libs.private in arrow.pc package config
Key: ARROW-6312
URL: https://issues.apache.org/jira/browse/ARROW-6312
Project: Apache Arrow
Hi,
On Mon, Aug 19, 2019 at 11:30 AM Kenta Murata wrote:
> (3) Adding SparseCSCIndex
>
I'd be interested to help with (Python) part of this SparseCSCIndex.
I'd appreciate any comments or suggestions.
>
I missed previous discussion, so this might have already been discussed,
but did we ever c
Ji Liu created ARROW-6311:
-
Summary: [Java] Make ApproxEqualsVisitor accept DiffFunction to
make it more flexible
Key: ARROW-6311
URL: https://issues.apache.org/jira/browse/ARROW-6311
Project: Apache Arrow
Wes McKinney created ARROW-6310:
---
Summary: [C++] Write 64-bit integers as strings in JSON
integration test files
Key: ARROW-6310
URL: https://issues.apache.org/jira/browse/ARROW-6310
Project: Apache Arr
Attendees:
åå
Micah Kornfield
Wes McKinney
Rok Mihevc
Antoine Pitrou
Prudhvi Porandla
Neal Richardson
Discussion:
* alignment vote: Wes and Micah discussed implementation and testing
forwards and backwards compatibility
* 0.15: Alignment issues will be "blockers"; doesn't seem there are
any othe
Hi all,
Recently, when we worked on fixing an IPC-related bug on both the Java and C++
sides [1][2], @emkornfield found that the stream reader assumes that all
dictionaries are at the start of the stream, which is inconsistent with the spec [3],
which says as long as a record batch doesn't reference a dictionar
hi Micah,
I agree that documenting the maturity of components is a good idea.
The discussion in ARROW-6206 contains some mildly offensive language
directed at the Arrow community, like "arrow is a team that picked up
netty derived off-heap tools naively". Excuse me? Documentation aside,
I think s
Antoine Pitrou created ARROW-6309:
-
Summary: [C++] Parquet tests are linked statically
Key: ARROW-6309
URL: https://issues.apache.org/jira/browse/ARROW-6309
Project: Apache Arrow
Issue Type:
Ji Liu created ARROW-6308:
-
Summary: [Java] Support write interleaved dictionaries and batches
in IPC stream
Key: ARROW-6308
URL: https://issues.apache.org/jira/browse/ARROW-6308
Project: Apache Arrow
hi Liya,
Do you intend to be able to send RLE vectors using the IPC protocol?
If so, we need to spend some time on Micah's discussion about
sparseness and encodings/compression.
- Wes
On Wed, Aug 21, 2019 at 7:33 AM Fan Liya wrote:
>
> Dear all,
>
> RLE (run length encoding) is a widely used en
Liya Fan created ARROW-6307:
---
Summary: [Java] Provide RLE vector
Key: ARROW-6307
URL: https://issues.apache.org/jira/browse/ARROW-6307
Project: Apache Arrow
Issue Type: New Feature
Compon
Dear all,
RLE (run length encoding) is a widely used encoding/decoding technique.
Compared with other encoding/decoding techniques, it is easier to work with
the encoded data.
We want to provide an RLE vector implementation in Arrow. The design
details include:
1. RleVector implements ValueVecto
Liya Fan created ARROW-6306:
---
Summary: [Java] Support stable sort by stable comparators
Key: ARROW-6306
URL: https://issues.apache.org/jira/browse/ARROW-6306
Project: Apache Arrow
Issue Type: New F
Joris Van den Bossche created ARROW-6305:
Summary: [Python] scalar pd.NaT incorrectly parsed in conversion
from Python
Key: ARROW-6305
URL: https://issues.apache.org/jira/browse/ARROW-6305
Pro
A recent issue with the JDBC adapter [1] made me realize we aren't doing
enough to communicate to consumers the maturity of various modules within
arrow. From the issue, it also seems like it is surprising that everything
is based off of off-heap data access.
To help with this I added a descripti