Hi folks,

I'm extraordinarily busy but it seems likely that the code review for
Parquet C++ is largely going to fall on my shoulders. Since this is a
large patch with many touch-points in the C++ codebase, and there
hasn't been much feedback over the course of its development, I would
like to leave thoughtful code review so that we can avoid problems that
may become a maintenance burden later.

On the OpenSSL issue -- there are many CMake projects in the wild that
use OpenSSL and there are many examples of how to incorporate other
dependencies into the Arrow build system. I don't have time to work on
this right now, so someone else will have to sort out the toolchain
issue.

Thanks

On Mon, May 20, 2019 at 8:48 AM Gidon Gershinsky <[email protected]> wrote:
>
> <re-sending from another account>
>
> Tham, thank you for this! and for volunteering early for the C++ version
> work, driving it forward and creating the bulk of the parquet-cpp
> encryption code along the way.
>
> @All - this announcement means that two implementations of Parquet
> encryption, fully conforming to the formal specification, are available
> today. Thanks to Revital for contributing to C++ version compliance with
> the encryption spec, and for running a set of basic Java-C++ encryption
> interoperability tests. We have tested plaintext and encrypted footer
> modes, GCM and GCM_CTR algorithms, new and legacy readers. Files written
> with parquet-cpp are successfully parsed by parquet-mr, and vice versa.
>
> Let me also thank Junjie, Nandor, Anna and Xinli for their support and vote
> for the encryption specification - along with the PMC folks.
>
> All parquet-format pull requests are merged by now into the encryption
> branch,
> https://github.com/apache/parquet-format/tree/encryption
>
> The community is welcome to review the parquet-mr pull requests, in the
> following order:
> https://github.com/apache/parquet-mr/pull/613
> https://github.com/apache/parquet-mr/pull/614
> https://github.com/apache/parquet-mr/pull/643
>
>
> Currently, an end-to-end implementation of Java (mr) Parquet encryption is
> collected in this branch:
> https://github.com/ggershinsky/parquet-mr/tree/encr
> Thanks to Xinli for working with this branch code, and contributing to it
> based on his field experience. Everybody is welcome to do the same.
> @All - it would be helpful to review & merge the above PRs in
> apache/parquet-mr/encryption, so that folks can work with it instead of my
> private branch.
>
> And I certainly second Tham's call to review & merge the parquet-cpp pull
> requests. By now, we have a number of companies starting to utilize Parquet
> encryption (both C++ and Java), including IBM.
>
>
> Cheers, Gidon.
>
>
> On Mon, May 20, 2019 at 1:40 PM Tham Ha <[email protected]> wrote:
>
> > Hi community,
> >
> >
> >
> > After a long period of development, I'm honored to announce that we have
> > just completed the C++ Parquet encryption module, which implements
> > encryption in the low-level API, with examples included.
> >
> >
> >
> > To have this feature completed, I would like to thank Gidon and Revital for
> > their contribution.
> >
> > Gidon had a key role in the encryption design and in writing the Java
> > version, on which we based the C++ version. He also wrote the crypto
> > package in the C++ version.
> >
> > Revital and I worked together on writing the C++ version. Revital
> > was responsible for AAD calculations, API updates (to match the
> > Java version) and Java-C++ interoperability testing. I wrote the first
> > draft (properties, metadata, writer, reader) and kept it updated as the
> > crypto package changed.
> >
> > We have had a great time working together. I also thank Gidon and Revital
> > for all the guidance and experience I have received from them.
> >
> >
> >
> > Here are the links to the pull requests:
> >
> > 1) encryption module (properties, metadata, writer, reader):
> > https://github.com/apache/arrow/pull/2555.
> >
> > 2) some merged pull requests for new thrift structure and crypto algorithm,
> > and one still open: https://github.com/apache/arrow/pull/3520
> >
> >
> >
> > However, in order to make (1) buildable with the current build scripts, we
> > need OpenSSL added to the C++ build toolchain, as described in this JIRA:
> > https://issues.apache.org/jira/browse/ARROW-4302. I would be grateful if
> > someone could help complete this work.
> >
> >
> >
> > The current pull requests are already in use in our development work at
> > Emotiv (https://www.emotiv.com/). We love using Parquet files to store
> > EEG data. We are going to release a product with encrypted Parquet files
> > soon and look forward to the official release of the Parquet encryption
> > feature. We would be very thankful and honored to have you review and
> > merge them (if they qualify).
> >
> >
> >
> > Thank you very much!
> >
> > Tham
> >
