Hi everyone,
For a long time, people working in big data have hoped that the tools they
use can make fuller use of big data's processing and analysis
capabilities. At present, from an API perspective, Hudi mostly provides
APIs related to data ingestion, and relies on various big data
Hello All,
Hive has the bucketBy feature, and Spark is going to add Hive-style bucketBy
support for data sources. Once it's implemented, it's going to greatly
benefit read performance. Since Hudi takes a different path when writing
parquet data, are we planning to add
Dear community,
Happy to share the Hudi community weekly update for 2020-08-23 ~ 2020-08-30,
with updates on discussions, features, and bug fixes.
===
Discussion
[Release] Hudi 0.6.0 has been released; it contains many features and
bug fixes [1]
As Hudi matures as a project, we need to get our devX and test infra rock
solid. Availability of test utils and base classes for ease of writing more
tests, stable integration tests, easier debugging, micro benchmarks,
performance test infra, automating checkstyle formatting, nightly snapshot