Hi Qian,
You are right on the choice of tools for 2 and 3. But for 1, if you want to
do a one-time bulk load, you can look into the options in the migration guide
http://hudi.apache.org/migration_guide.html (HiveSyncTool is orthogonal to
this; it simply registers a Hudi dataset with the Hive metastore)
On
https://issues.apache.org/jira/browse/HUDI-288 tracks this
On Tue, Oct 1, 2019 at 10:17 AM Vinoth Chandar wrote:
>
> I think this has come up before.
>
> +1 to the point Pratyaksh mentioned. I would like to add a few more
>
> - Schema could be fetched dynamically from a registry based on
Hi Kabeer,
I plan to do an incremental query PoC. My use case includes:
1. Load one big Hive table located in HDFS into Hudi as a history table (I think
I should use HiveSyncTool)
2. Sink streaming data from Kafka to Hudi as a real-time table (use
HoodieDeltaStreamer?)
3. Join the two tables to get
Awesome!
On Wed, Oct 2, 2019 at 3:01 PM Gautam Nayak
wrote:
> Thanks Vinoth for the tip. We were able to fix the issue, as our Spark
> cluster (2.2.0) bundled both spark-streaming-kafka-0-8 and
> spark-streaming-kafka-0-10 jars. Getting rid of spark-streaming-kafka-0-10
> jars from the cluster
Thanks Vinoth for the tip. We were able to fix the issue, as our Spark
cluster (2.2.0) bundled both spark-streaming-kafka-0-8 and
spark-streaming-kafka-0-10 jars. Getting rid of spark-streaming-kafka-0-10 jars
from the cluster resolved the ClassCastException.
On Oct 1, 2019, at 10:25 AM, Vinoth
Qian
Welcome!
Are you able to tell us a bit more about your use case? E.g., type of the
project, industry, complexity of the pipeline that you plan to write (e.g.,
pulling data from external APIs like the New York taxi dataset and writing them
into Hive for analysis) etc.
This will give us a bit more
edit:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=113709185#Frequentlyaskedquestions(FAQ)-HowisaHudijobdeployed?
with the ? at the end
On Wed, Oct 2, 2019 at 2:54 PM Vinoth Chandar wrote:
> Hi Qian,
>
> Welcome! Does
>
Hi Qian,
Welcome! Does
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=113709185#Frequentlyaskedquestions(FAQ)-HowisaHudijobdeployed?
help?
On Wed, Oct 2, 2019 at 10:18 AM Qian Wang wrote:
> Hi,
>
> I am new to Apache Hudi. Currently I am working on a PoC using Hudi and
>
Hi,
I am new to Apache Hudi. Currently I am working on a PoC using Hudi. Can anyone
point me to some documentation on how to deploy Apache Hudi? Thanks.
Best,
Eric
This week I have limited internet access and would not be able to help
much.
On Wed, Oct 2, 2019 at 13:26 Thomas Weise wrote:
> I looked at the PR and I see a disturbing number of LICENSE file
> repetitions in it. There should be no need for that as LICENSE can be
> included automatically by
Based on some conversations I had with Flink folks including Hudi's very
own mentor Thomas, it seems future-proof to look into supporting the Flink
streaming APIs. The batch APIs, IIUC, will move towards converging with
Streaming APIs, which matches Hudi's model anyway
From Hudi's perspective,