-1 (binding)
I'm voting -1 on this; I posted my reasoning on the PR. The high-level
summary is that the proposal needs to better address the pipelined use
case: as it stands it fails to support that use case at all, and that use
case carries too much weight to ignore.
I actually would have posted it here but totally
Karl Dunkle Werner created ARROW-7345:
-
Summary: [Python] Writing partitions with NaNs silently drops data
Key: ARROW-7345
URL: https://issues.apache.org/jira/browse/ARROW-7345
Project: Apache Arrow
Neal Richardson created ARROW-7344:
--
Summary: [Packaging][Python] Build manylinux2014 wheels
Key: ARROW-7344
URL: https://issues.apache.org/jira/browse/ARROW-7344
Project: Apache Arrow
Hi Christian,
As far as I know, no one is working on a canonical text representation for
schemas. A JSON serializer exists for integration-test purposes, but
IMO it shouldn't be relied upon as canonical.
It looks like Flatbuffers supports serialization to/from JSON [1
Hello,
Could more PMC members take a look at this work?
Thank you
On Tue, Dec 3, 2019 at 1:50 PM Neal Richardson
wrote:
>
> +1 (non-binding)
>
> On Tue, Dec 3, 2019 at 10:56 AM Wes McKinney wrote:
>
> > +1 (binding)
> >
> > On Tue, Dec 3, 2019 at 12:54 PM Wes McKinney wrote:
> > >
> > >
David Li created ARROW-7343:
---
Summary: Memory leak in Flight ArrowMessage
Key: ARROW-7343
URL: https://issues.apache.org/jira/browse/ARROW-7343
Project: Apache Arrow
Issue Type: Bug
Hi,
For the uses I would like to make of Arrow, I would need a human-readable
and -writable version of an Arrow Schema, that could be converted to and
from the Arrow Schema C++ object. Going through the doc for 0.15.1, I don't
see anything to that effect, with the closest being the ToString()
Steve M. Kim created ARROW-7342:
---
Summary: [Java] offset buffer for vector of variable-width type
with zero value count is empty
Key: ARROW-7342
URL: https://issues.apache.org/jira/browse/ARROW-7342
Neal Richardson created ARROW-7341:
--
Summary: [CI] Unbreak nightly Conda R job
Key: ARROW-7341
URL: https://issues.apache.org/jira/browse/ARROW-7341
Project: Apache Arrow
Issue Type: Bug
Thanks. I similarly noticed that uint32 gets converted to int64. This
makes some surface sense as uint32 is a logical type with int64 as the
backing physical type. However, uint8, uint16, and uint64 all keep their
data types so I was a little surprised.
On Fri, Dec 6, 2019 at 6:52 AM Wes
Neal Richardson created ARROW-7340:
--
Summary: [CI] Prune defunct appveyor build setup
Key: ARROW-7340
URL: https://issues.apache.org/jira/browse/ARROW-7340
Project: Apache Arrow
Issue Type:
Francois Saint-Jacques created ARROW-7339:
-
Summary: [CMake] Thrift version not respected in CMake
configuration version.txt
Key: ARROW-7339
URL: https://issues.apache.org/jira/browse/ARROW-7339
Some notes
* 96-bit INT96 nanosecond timestamps are deprecated in the Parquet format,
so we don't write them unless you pass the
use_deprecated_int96_timestamps flag
* 64-bit timestamps are relatively new to the Parquet format, I'm not
actually sure what's required to write these.
Francois Saint-Jacques created ARROW-7338:
-
Summary: [C++] Rename SimpleDataSource to InMemoryDataSource
Key: ARROW-7338
URL: https://issues.apache.org/jira/browse/ARROW-7338
Project: Apache Arrow
If my table has timestamp fields with ns resolution and I save the table to
Parquet format without specifying any timestamp args (default coerce and
legacy settings), then it automatically converts my timestamps to us
resolution.
As best I can tell Parquet supports ns resolution so I would prefer
Arrow Build Report for Job nightly-2019-12-06-0
All tasks:
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-06-0
Failed Tasks:
- test-conda-python-3.7-hdfs-2.9.2:
URL:
Hi Wes and Liya,
Appreciate your feedback and information.
Looking forward to a more efficient integration between Arrow and Spark on
the Java/Scala level. I would like to make my contribution if I can help in
any way during my free time.
Thank you very much.
Best Regards,
WANG GAOXIANG
Krisztian Szucs created ARROW-7337:
--
Summary: [CI][C++] Exercise benchmarks as GitHub actions cron job
Key: ARROW-7337
URL: https://issues.apache.org/jira/browse/ARROW-7337
Project: Apache Arrow
Yuan Zhou created ARROW-7336:
Summary: implement minmax options
Key: ARROW-7336
URL: https://issues.apache.org/jira/browse/ARROW-7336
Project: Apache Arrow
Issue Type: Improvement
Hi folks,
Thanks for your clarification.
I also think this is a universal requirement (including Java UDF in Arrow
format).
The Java converter provided by Spark is inefficient for two reasons (IMO):
1. There are frequent memory copies between on-heap and off-heap memory.
2. The Spark API is