cmsxbc created ARROW-9040:
-
Summary: [Python][Parquet] "_ParquetDatasetV2" fails to read with
columns and use_pandas_metadata=True
Key: ARROW-9040
URL: https://issues.apache.org/jira/browse/ARROW-9040
Project: Apache Arrow
--
ZJ
zhuojia@gmail.com
Yoav Git created ARROW-9039:
---
Summary: py_bytes created by pyarrow 0.11.1 cannot be deserialized
by more recent versions
Key: ARROW-9039
URL: https://issues.apache.org/jira/browse/ARROW-9039
Project: Apache Arrow
Yibo Cai created ARROW-9038:
---
Summary: [C++] Improve BitBlockCounter
Key: ARROW-9038
URL: https://issues.apache.org/jira/browse/ARROW-9038
Project: Apache Arrow
Issue Type: Improvement
Zhuo Peng created ARROW-9037:
Summary: [C++/C-ABI] unable to import array with null count == -1
(which could be exported)
Key: ARROW-9037
URL: https://issues.apache.org/jira/browse/ARROW-9037
Project: Apache Arrow
Gaurangi Saxena created ARROW-9036:
--
Summary: Null pointer exception when caching data frames
Key: ARROW-9036
URL: https://issues.apache.org/jira/browse/ARROW-9036
Project: Apache Arrow
On 04/06/2020 at 18:11, Rémi Dettai wrote:
> Ideally, we should be able to presize the array to a good enough
> estimate.

You should be able to get away with a correct estimation because Parquet
column metadata contains the uncompressed size. But is there anything wrong
with this idea of mmaping huge "runways" for our larger allocations?
I documented [1] the behaviors by experimentation or by reading the
documentation. My experiments were mostly about checking INT64_MAX +
1. My preference would be to use the platform-defined behavior by
default and provide a safety option that errors.
Feel free to add more databases/systems.
On Thu, 4 Jun 2020 17:48:16 +0200, Rémi Dettai wrote:

When creating large arrays, Arrow uses realloc quite intensively.

I have an example where I read a gzipped Parquet column (strings) that
expands from 8 MB to 100+ MB when loaded into Arrow. Of course jemalloc
cannot anticipate this and every reallocate call above 1 MB (the most
critical ones) ends
Anthony Abate created ARROW-9035:
Summary: 8 vs 64 byte alignment
Key: ARROW-9035
URL: https://issues.apache.org/jira/browse/ARROW-9035
Project: Apache Arrow
Issue Type: Bug
Wes McKinney created ARROW-9034:
---
Summary: [C++] Implement binary (two bitmap) version of
BitBlockCounter
Key: ARROW-9034
URL: https://issues.apache.org/jira/browse/ARROW-9034
Project: Apache Arrow
On Thu, Jun 4, 2020 at 4:57 AM Krisztián Szűcs wrote:
>
> On Thu, Jun 4, 2020 at 11:09 AM Rémi Dettai wrote:
> >
> > It makes sense to me that the default behaviour of such a low level api as
> > kernel does not do any automagic promotion, but shouldn't this kind of
> > promotion still be
Arrow Build Report for Job nightly-2020-06-04-0
All tasks:
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-06-04-0
Failed Tasks:
- centos-7-aarch64:
URL:
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-06-04-0-travis-centos-7-aarch64
-
On Thu, Jun 4, 2020 at 11:09 AM Rémi Dettai wrote:

It makes sense to me that the default behaviour of such a low-level API as a
kernel does not do any automagic promotion, but shouldn't this kind of
promotion still be requestable by the so-called "system developer" user?
Otherwise he would need to materialize a promoted version of each original