Thanks all for your discussions.
I'll share my opinion here:
1. Hive SQL and Hive-like SQL are the absolute mainstay of current
batch ETL in China. Hive + Spark (Hive SQL-like) + Databricks also occupy
a large share of the market worldwide.
- Unlike OLAP SQL (such as Presto, which is ANSI SQL rather than Hive
Hi Martijn,
Thanks for bringing this up.
Hive SQL (used in Hive & Spark) plays an important role in batch processing;
it has almost become the de facto standard there. In our company,
there are hundreds of thousands of Spark jobs each day.
IMO, if we want to promote Flink batch, Hive
Hi Martijn,
Thanks for starting this discussion. I think it's great
for the community to reach a consensus on the roadmap
of Hive query syntax.
I agree that the Hive project is not actively developed nowadays.
However, Hive still occupies the majority of the batch market
and the Hive
Hi,
Thanks Martijn for driving this discussion. Your concerns are very
rational.
We should do our best to keep Flink development on the right track. I
would suggest discussing it in a vision- and goal-oriented way. Since Flink has
a clear vision of unified batch and stream processing, supporting
Hi Martijn,
Thanks for driving this discussion.
+1 to the efforts toward more Hive syntax compatibility.
With the work on batch processing in recent versions (1.10~1.15), many
users have run batch processing jobs on Flink.
In our team, we are trying to migrate most of the existing online batch