7/17/2019
Attendees:
Ryan Blue (Netflix)
Jame (Netflix)
Gidon Gershinsky (IBM)
Steven (Yelp)
Deepak and several other folks (Vertica)
Xinli Shang (Uber)
Junjie Chen
Topics:
1.
Column Encryption
1.
Gidon:
1.
C++ version code review: all feedback has been addressed. The last
step is testing, which will hopefully be done tomorrow.
2.
Reviewed the bloom filter design from the Parquet encryption
perspective. It is straightforward.
3.
Not much was done on the Java version of Parquet. Worked with Xinli
to fix several issues.
4.
Found throughput issues in the Java version and fixed them.
2.
Xinli:
1.
Gidon sent out a design that consolidates the different ways of
deploying Parquet encryption, but it has not received much attention
from the community. Please have a look if you are interested.
2.
There is a discussion about unifying table properties in
HMS (HIVE-21848) for both ORC and Parquet column encryption.
Please chime in if you have any concerns.
3.
Review of the Java version (parquet-mr) PRs is slow. How do we move
faster? We need more people to review them.
1.
https://github.com/apache/parquet-mr/pull/613
2.
https://github.com/apache/parquet-mr/pull/614
3.
https://github.com/apache/parquet-mr/pull/643
3.
Jim
1.
What is blocked on the parquet-mr review? We need more people to
review it. There are a lot of PRs now.
4.
Deepak
1.
Does Parquet encryption work with Hive?
1.
Yes, we have tested it (Xinli).
2.
Also has questions about the table properties definition.
1.
HIVE-21848 (Xinli)
2.
Bloom filter
1.
Junjie Chen
1.
We need one more PMC vote.
2.
Ryan
1.
I will have a look next week. Were the issues raised earlier
addressed?
1.
Yes (Junjie)
2.
parquet-format should be considered the upstream for parquet-cpp
and parquet-mr, which are implementations.
3.
We need the encryption specification merged into parquet-format ASAP,
then the bloom filter. Otherwise, parquet-format will depend on
parquet-cpp and parquet-mr, which is not right.
https://github.com/apache/parquet-format/pull/68
https://github.com/apache/parquet-format/pull/142
3.
Xinli
1.
Is parquet-format 2.6 + encryption compatible with parquet
2.7 (encryption + bloom filter)?
1.
By design, yes (Gidon)
2.
If there is a prototype for the bloom filter, please add Xinli for
testing to make sure the two are compatible.
3.
Parquet-1.11.0 Release Validation
1.
Ryan
1.
Both Ryan and Zoltan are very busy. No progress so far.
2.
We need to write a test to make sure data writes and reads are
correct.
4.
Remove old Parquet modules
1.
Ryan
1.
No time. If somebody has time to do it, go for it.
--
Xinli Shang