[
https://issues.apache.org/jira/browse/HUDI-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193086#comment-17193086
]
Vinoth Chandar commented on HUDI-1267:
--------------------------------------
Ah, got it. There was a proposal for a UI on top that reads across tables. This
is worth discussing again on the mailing list.
This was the rough approach:
# We run a long-running instance of TimelineServer and have all the writers
to each table report commits (or have the server pull them), and materialize
the table metadata in a local RocksDB.
# We can then build a REST layer on top of it and hook up a UI.
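The two steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Hudi's TimelineServer API: the `CommitRecord` and `CommitRegistry` names, their fields and methods, and the in-memory dict standing in for the local RocksDB store are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CommitRecord:
    # Hypothetical shape of one reported commit; field names are assumptions.
    table: str
    commit_time: str  # fixed-width instant time, e.g. "20200909120000"
    operation: str    # e.g. "UPSERT", "INSERT"


class CommitRegistry:
    """Stand-in for the long-running server; a dict replaces RocksDB here."""

    def __init__(self):
        # table name -> list of CommitRecord reported for that table
        self._commits = defaultdict(list)

    def report_commit(self, record: CommitRecord) -> None:
        # Step 1: a writer reports each completed commit to the server.
        self._commits[record.table].append(record)

    def latest_commits(self) -> dict:
        # Step 2: a cross-table view that a REST endpoint or UI could serve.
        # Fixed-width instant times sort correctly as plain strings.
        return {
            table: asdict(max(records, key=lambda r: r.commit_time))
            for table, records in self._commits.items()
        }
```

A REST layer would then only need to serialize the result of `latest_commits()`; whether writers push commits or the server pulls them changes who calls `report_commit`, not the stored shape.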
> Additional Metadata Details for Hudi Transactions
> -------------------------------------------------
>
> Key: HUDI-1267
> URL: https://issues.apache.org/jira/browse/HUDI-1267
> Project: Apache Hudi
> Issue Type: Improvement
> Components: Usability
> Reporter: Ashish M G
> Priority: Major
> Labels: features
> Fix For: 0.7.0
>
>
> Whenever any of the following scenarios happen:
> # Custom Datasource (Kafka, for instance) -> Hudi Table
> # Hudi -> Hudi Table
> # s3 -> Hudi Table
> the following metadata needs to be captured:
> # Table Level Metadata
> ** Operation name (record level), such as Upsert or Insert, for the last
> operation performed on the row
> # Transaction Level Metadata (this will be logged at the Hudi level and not
> at the table level)
> ** Source (Kafka topic name, S3 URL for the source data in the case of S3, etc.)
> ** Target Hudi table name
> ** Last transaction time (last commit time)
> Basically, point (1) collects all details at the table level and point (2)
> collects all the transactions that happened at the Hudi level.
> Point (1) would be just a column addition for the operation type.
> E.g. for point (2): suppose we had an ingestion from Kafka topic 'A' to Hudi
> table 'ingest_kafka' and another ingestion from RDBMS table ('tableA')
> through Sqoop to Hudi table 'RDBMSingest'; then the metadata captured would
> be:
>
> |Source|Timestamp|Transaction Type|Target|
> |Kafka - 'A'|XXXXXX|UPSERT|ingest_kafka|
> |RDBMS - 'tableA'|XXXXXX|INSERT|RDBMSingest|
>
> The Transaction Details Table in point (2) should be available as a separate
> common table, which can be queried as a Hudi table or stored as Parquet and
> queried from Spark.
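
To make the proposed Transaction Details Table concrete, here is a minimal sketch of its record shape using the four columns from the example above. The `TransactionRecord` class, the snake_case field names, and the helper functions are assumptions made for illustration; nothing here is an existing Hudi API, and any timestamps passed in are placeholders rather than real commit times.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TransactionRecord:
    # One row of the proposed table; field names are hypothetical.
    source: str            # e.g. "Kafka - 'A'"
    timestamp: str         # last commit time, fixed-width string
    transaction_type: str  # e.g. "UPSERT", "INSERT"
    target: str            # target Hudi table name


def to_rows(records):
    # Plain dicts, a shape most DataFrame or Parquet writers accept.
    return [asdict(r) for r in records]


def latest_per_target(records):
    # Keep only the most recent transaction for each target table.
    latest = {}
    for record in records:
        current = latest.get(record.target)
        if current is None or record.timestamp > current.timestamp:
            latest[record.target] = record
    return latest
```

Persisting the rows as a Hudi table or as Parquet is deliberately left out; the sketch only fixes the schema that such a common table would carry.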