Re: [VOTE] Clarify meaning of timestamp without time zone to equal the concept of "LocalDateTime"

2021-06-25 Thread Joris Peeters
+1 On Fri, Jun 25, 2021 at 9:29 AM Joris Van den Bossche < jorisvandenboss...@gmail.com> wrote: > +1 > > On Thu, 24 Jun 2021 at 21:21, Micah Kornfield > wrote: > > > +1 (binding) > > > > On Thu, Jun 24, 2021 at 12:17 PM Weston Pace > > wrote: > > > > > The discussion in [1] led to the

Re: [STRAW POLL] (How) should Arrow define storage for "Instant"s

2021-06-24 Thread Joris Peeters
C On Thu, Jun 24, 2021 at 8:39 PM Antoine Pitrou wrote: > > Option C. > > > Le 24/06/2021 à 21:24, Weston Pace a écrit : > > > > This proposal states that Arrow should define how to encode an Instant > > into Arrow data. There are several ways this could happen, some which > > change

Re: Long title on github page

2021-06-12 Thread Joris Peeters
+1 On Sat, Jun 12, 2021 at 2:56 PM Wes McKinney wrote: > Thanks Kou! I have updated the description using .asf.yaml. Appreciate > everyone giving thought to this! > > On Thu, Jun 10, 2021 at 8:13 PM Sutou Kouhei wrote: > > > > It seems that we can use .asf.yaml to set the description on > >

Re: [Format] Timestamp timezone semantics?

2021-06-02 Thread Joris Peeters
You could store epoch offsets, but interpret them in the local timezone. E.g. (0, "America/New_York") could mean 1970-01-01 00:00:00 in the New York timezone. At least one nasty problem with that is ambiguous times, i.e. when the clock turns back on going from DST to ST, as well as invalid times
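To make the ambiguity concrete, here is a minimal Python sketch. It uses fixed UTC offsets (-4 h and -5 h) as stand-ins for New York's daylight and standard time around the 2021-11-07 fall-back. That is an assumption made so the example needs no tz database, and all variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets stand in for America/New_York's two regimes.
EDT = timezone(timedelta(hours=-4), "EDT")  # daylight time
EST = timezone(timedelta(hours=-5), "EST")  # standard time

# Two distinct instants, one hour apart, either side of the clock change.
instant_a = datetime(2021, 11, 7, 5, 30, tzinfo=timezone.utc)
instant_b = datetime(2021, 11, 7, 6, 30, tzinfo=timezone.utc)

# The wall-clock reading a New York observer would see for each instant.
wall_a = instant_a.astimezone(EDT).replace(tzinfo=None)
wall_b = instant_b.astimezone(EST).replace(tzinfo=None)

# "Epoch offset interpreted in the local timezone": seconds since
# 1970-01-01 00:00:00 on the local wall clock.
local_epoch = datetime(1970, 1, 1)
local_offset_a = (wall_a - local_epoch).total_seconds()
local_offset_b = (wall_b - local_epoch).total_seconds()

print(wall_a, wall_b)                    # the same wall-clock time twice
print(local_offset_a == local_offset_b)  # the stored values collide
```

Both instants map to the same local reading, so a stored pair such as (offset, "America/New_York") cannot tell them apart without some extra disambiguating information.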

Re: [DISCUSS] Revisiting LZ4 Compression for Arrow Buffers

2021-03-11 Thread Joris Peeters
"Is https://github.com/lz4/lz4-java the fast Java lz4 library in question? The incompleteness of this implementation is a known problem for other user communities, not only Arrow. It would be a great public service to improve it so that it fully implements the lz4 frame specification." Very much

Re: [Java] IPC stream write with re-stated dictionaries

2021-03-05 Thread Joris Peeters
> > Thanks, > Micah > > [1] > > https://github.com/apache/arrow/blob/master/java/vector/src/test/java/org/apache/arrow/vector/ipc/TestArrowReaderWriter.java#L614 > > On Thu, Mar 4, 2021 at 4:06 AM Joris Peeters > wrote: > > > Hello, > > > > F

[Java] IPC stream write with re-stated dictionaries

2021-03-04 Thread Joris Peeters
Hello, For my use case I'm sending an Arrow IPC-stream from a server to a client, with some columns being dictionary-encoded. Dictionary-encoding happens on the fly, though, so the full dictionary isn't known yet at the beginning of the stream, but rather is computed for every batch, and
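As a rough illustration of per-batch dictionary encoding, here is a plain-Python sketch. It does not use the Arrow API; `dictionary_encode` and the sample batches are hypothetical and only show why a dictionary computed per batch has to travel with that batch.

```python
def dictionary_encode(batch):
    """Return (dictionary, indices) for a single batch of values."""
    dictionary = []   # distinct values, in order of first appearance
    positions = {}    # value -> index into dictionary
    indices = []
    for value in batch:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return dictionary, indices


# Hypothetical data: two batches encoded independently, on the fly.
batches = [["a", "b", "a"], ["c", "a", "c"]]
encoded = [dictionary_encode(batch) for batch in batches]

# Decoding a batch needs the dictionary that was computed for that batch.
decoded = [[dictionary[i] for i in indices] for dictionary, indices in encoded]

for dictionary, indices in encoded:
    print(dictionary, indices)
```

The two batches end up with identical indices but different dictionaries, so the indices are only meaningful next to the dictionary that was emitted for the same batch.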

Re: pyarrow: write table where columns share the same dictionary

2021-03-02 Thread Joris Peeters
Made https://issues.apache.org/jira/browse/ARROW-11838 to track. If someone adds me as a Contributor (Joris Peeters / jmgpeeters) I'm happy to assign it to myself. -J On Tue, Mar 2, 2021 at 9:34 AM Antoine Pitrou wrote: > > Hi Joris, > > On Mon, 1 Mar 2021 19:04:08 +0000 > Joris

Fwd: pyarrow: write table where columns share the same dictionary

2021-03-01 Thread Joris Peeters
message - From: Joris Peeters Date: Fri, Feb 26, 2021 at 10:11 AM Subject: Re: pyarrow: write table where columns share the same dictionary To: FWIW, in the Java client it's https://github.com/apache/arrow/blob/apache-arrow-3.0.0/java/vector/src/main/java/org/apache/arrow/vector/ipc

java arrow: memory management with multiple references to same batch

2021-01-29 Thread Joris Peeters
Hello, I'm writing an HTTP server in Java that provides Arrow data to users. For performance, I keep the most-recently-used Arrow batches in an in-memory cache. A batch is wrapped in a "DataBatch" Java object containing the schema and field vectors. I'm looking for a good memory management

Re: lz4 compressed arrow between Python & Java

2021-01-28 Thread Joris Peeters
ing > worked on (I would need to double check if the PR has been merged) and I > don't think it's been integration tested with C++/Python I would imagine it > would run into a similar issue with not being able to decode linked blocks. > > On Thu, Jan 28, 2021 at 10:19 AM Jori

Re: lz4 compressed arrow between Python & Java

2021-01-28 Thread Joris Peeters
Antoine Pitrou > wrote: > > > > > Le 28/01/2021 à 17:59, Joris Peeters a écrit : > > > From Python, I'm dumping an LZ4-compressed arrow stream to a file, > using > > > > > > with pa.output_stream(path, compression = 'lz4') a

lz4 compressed arrow between Python & Java

2021-01-28 Thread Joris Peeters
From Python, I'm dumping an LZ4-compressed arrow stream to a file, using

    with pa.output_stream(path, compression = 'lz4') as fh:
        writer = pa.RecordBatchStreamWriter(fh, table.schema)
        writer.write_table(table)
        writer.close()

I then try reading this file from Java,

Re: Flight: beginner questions for usage

2020-06-23 Thread Joris Peeters
org/docs/format/Flight.html While it's labeled > "Format", it contains an example of a Flight request flow. > > Best, > David > > On 6/23/20, Joris Peeters wrote: > > Hello, > > > > I'm interested in using Flight for serving large amounts of data in a > >

Flight: beginner questions for usage

2020-06-23 Thread Joris Peeters
Hello, I'm interested in using Flight for serving large amounts of data in a parallelised manner, and just building some Python prototypes, based on https://github.com/apache/arrow/blob/apache-arrow-0.17.1/python/examples/flight In my use-case, we'd have a bunch of worker servers, serving a

Re: Java/Scala: efficient reading of Parquet into Arrow?

2019-05-23 Thread Joris Peeters
e Spark, at least. I'm interested to see a > reusable library that supports vectorized Arrow reads in Java. > > - Wes > > [1]: https://github.com/dremio/dremio-oss > > On Thu, May 23, 2019 at 8:54 AM Joris Peeters > wrote: > > > > Hello, > > > > I'm

Java/Scala: efficient reading of Parquet into Arrow?

2019-05-23 Thread Joris Peeters
Hello, I'm trying to read a Parquet file from disk into Arrow in memory, in Scala. I'm wondering what the most efficient approach is, especially for the reading part. I'm aware that Parquet reading is perhaps beyond the scope of this mailing list but, - I believe Arrow and Parquet are closely

Re: MATLAB, Arrow, ABI's and Linux

2019-03-13 Thread Joris Peeters
I think > > if you use gcc 6.3 for building both gsa-arrow-cpp and arrow-matlab we > > might be able to eliminate the LD_PRELOAD, which I think should be > > nicer for your end-users. What was the error you saw without > > LD_PRELOAD? > > > > Hatem > > >

Re: MATLAB, Arrow, ABI's and Linux

2019-03-13 Thread Joris Peeters
se > > recipes. > > > > If you need shared libraries using the gcc 4.x ABI you may have to > > build them yourself right, or use the Linux packages for the platform > > where you are working. It would be useful to have a Dockerfile that > > produces "portable

MATLAB, Arrow, ABI's and Linux

2019-03-12 Thread Joris Peeters
[A] Short background: We are working on a MEX library that converts a binary array (representing an Arrow stream or file) into MATLAB structs. This is in parallel/complement to what already exists in the main Arrow project, which focuses on feather, but the hope is certainly to contribute back

Re: undefined reference to `arrow::Status::ToString[abi:cxx11]() const'

2018-08-31 Thread Joris Peeters
Arrow libraries from source on your compiler > * Pass -D_GLIBCXX_USE_CXX11_ABI=0 to gcc when compiling all object > code with direct or indirect linkage (e.g. an std::string created > someplace else will be ABI-incompatible) to conda-forge binaries > > - Wes > On Fri, Aug 31,

undefined reference to `arrow::Status::ToString[abi:cxx11]() const'

2018-08-31 Thread Joris Peeters
I am trying to compile a small piece of C++ code, linking against the arrow libraries which I retrieved through Anaconda (conda install -c conda-forge arrow-cpp). The minified code for test.cpp looks like this, >> #include #include void checkStatus(arrow::Status const& status) { if (!status.ok())

Re: Arrow for MATLAB?

2018-02-14 Thread Joris Peeters
thworks.com/help/matlab/matlab-data-array.html that once > > someone attempts it, it should not be hard to build. > > > > If you want to try to take a shot, we are happy to help if there are > > problems with the Arrow side of things. > > > > Uwe > > > > O