Re: Hudi - Concurrent Writes

2020-07-08 Thread Mario de Vera
hey Shayan,

That actually seems like a very good approach. I am just curious about the glue
metastore you mentioned: would it be an external metastore for Spark to
query, external in the sense of not being managed by Hudi?

That would be my only concern: how to keep the metadata of all partitions
in sync. But, again, a very promising approach!

regards,

Mario.

On Wed, Jul 8, 2020 at 15:20, Shayan Hati wrote:

> Hi folks,
>
> We have a use-case where we want to ingest data concurrently for different
> partitions. Currently Hudi doesn't support concurrent writes on the same
> Hudi table.
>
> One of the approaches we were considering is to use one Hudi table per
> partition of data. So, say we have 1000 partitions: we would have 1000
> Hudi tables, which would let us write concurrently to each partition.
> The metadata for each partition would be synced to a single metastore
> table (the assumption here is that the schema is the same for all
> partitions). This single metastore table can then be used for all Spark
> and Hive queries. Basically, this metastore glues the data of all the
> different Hudi tables together into a single table.
>
> We have already tested this approach and it is working fine; each partition
> has its own timeline and Hudi table.
>
> We wanted to know if there are some gotchas or any other issues with this
> approach to enable concurrent writes? Or if there are any other approaches
> we can take?
>
> Thanks,
> Shayan
>
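
The one-table-per-partition layout described in the quoted message can be sketched roughly as below. This is only an illustration under stated assumptions, not the setup Shayan actually used: the helper name `table_for_partition`, the path layout, and the `id` / `ts` columns are hypothetical, and the metastore sync step is left out. The `hoodie.*` keys are standard Hudi Spark datasource write options.

```python
from typing import Dict, Tuple


def table_for_partition(root: str, dataset: str, partition: str) -> Tuple[str, Dict[str, str]]:
    """Return (base_path, write_options) for the Hudi table that owns one partition.

    Hypothetical helper: each partition gets its own base path, and therefore
    its own Hudi timeline, so writers for different partitions never share a table.
    """
    # Table names should not contain path or key=value punctuation.
    safe = partition.replace("/", "_").replace("=", "_").replace("-", "_")
    base_path = f"{root.rstrip('/')}/{dataset}/{partition}"
    options = {
        "hoodie.table.name": f"{dataset}_{safe}",
        "hoodie.datasource.write.recordkey.field": "id",   # assumed column name
        "hoodie.datasource.write.precombine.field": "ts",  # assumed column name
        "hoodie.datasource.write.operation": "upsert",
    }
    return base_path, options


if __name__ == "__main__":
    path, opts = table_for_partition("s3://bucket/lake", "events", "date=2020-07-08")
    print(path)
    print(opts["hoodie.table.name"])
    # With a SparkSession and a DataFrame `df` available, each concurrent
    # writer would then do roughly:
    #   df.write.format("hudi").options(**opts).mode("append").save(path)
```

Because every writer targets a different base path, no two writers touch the same timeline. The open question raised in the reply remains how the single metastore table is kept in sync with all of these base paths.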


Re: [DISCUSS] Publishing benchmarks for releases

2020-06-22 Thread Mario de Vera
+1 for performance reports

On Mon, 22 Jun 2020, 02:41 vino yang wrote:

> +1 as well,
>
> it would be helpful to measure the performance between different versions.
>
> On Mon, Jun 22, 2020 at 8:37 AM Shiyan Xu wrote:
>
> > +1 definitely useful info.
> >
> > On Sun, Jun 21, 2020 at 4:56 PM Sivabalan wrote:
> >
> > > Hey folks,
> > > Is it a common practice to publish benchmarks for releases? I have put
> > > up an initial PR to add JMH benchmark support to a couple of Hudi
> > > operations. If the community feels positive about publishing
> > > benchmarks, we can add support for more operations and publish some
> > > benchmark numbers for every release.
> > >
> > > --
> > > Regards,
> > > -Sivabalan
> > >
> >
>


Re: How to extend the timeline server schema to accommodate business metadata

2020-06-10 Thread Mario de Vera
Great! I will definitely follow up.

On Wed, Jun 10, 2020 at 19:28, Bhavani Sudha wrote:

> Ah okay. Thanks for letting us know. I created a Jira here to capture this
> thread - https://issues.apache.org/jira/browse/HUDI-1020. Feel free to add
> to the jira.
>
> Thanks,
> Sudha
>

Re: How to extend the timeline server schema to accommodate business metadata

2020-06-10 Thread Mario de Vera
Sure Sudha. Unfortunately, I am afraid I am not allowed to become a Hudi
contributor; I have to restrict myself to being an enthusiast, as my current
employer applies some severe restrictions.

I would be more than happy to contribute by specifying the requirements, but
from a code developer's perspective I will have to pass on that for now.

On Wed, Jun 10, 2020 at 18:40, Bhavani Sudha wrote:

> Definitely. I was trying to add you to the Hudi contributors so you can
> create a Jira. For that I need a Jira ID. If you have not already signed
> up, please sign up for Jira and let me know your Jira ID.
>
> Thanks,
> Sudha
>

Re: How to extend the timeline server schema to accommodate business metadata

2020-06-10 Thread Mario de Vera
Hi Sudha,

Can you or Vinoth help me with this? How can we create a JIRA for that?

I can contribute the description and the definition of done.

Thanks,

Mario.

On Tue, 9 Jun 2020, 23:46 Bhavani Sudha wrote:

> Hi Mario,
>
> Can you please share your Jira ID?
>
> Thanks,
> Sudha
>

Re: How to extend the timeline server schema to accommodate business metadata

2020-06-09 Thread Mario de Vera
Hey Vinoth, I noticed you added this suggestion to the weekly log; that
is great! Just let me know whether I am able to create a JIRA, as I tried to
go to the HUDI project in Apache and did not find a way to do it. I can bring
in a good description of the benefits, etc.

thanks, Mario.

On Mon, Jun 8, 2020 at 12:46, Vinoth Chandar wrote:

> Hi,
>
> We can probably make a new JIRA. I am not sure if there is an existing JIRA
> to re-use.
> The following modules are good to look at.
>
> hudi-timeline-service
> packaging/hudi-timeline-server-bundle
>
> Thanks
> Vinoth
>

Re: How to extend the timeline server schema to accommodate business metadata

2020-06-05 Thread Mario de Vera
Sorry, Vinoth, for not being clear. If that is a work in progress, would you
have a Jira I could follow and contribute to? If not, which module would
you suggest I look at?

Regards,

Mario.

On Fri, 5 Jun 2020, 02:12 Vinoth Chandar wrote:

> Sorry, I did not understand the last part. :) Are you suggesting we create a
> Jira?
>
> On Thu, Jun 4, 2020 at 1:08 AM Mario de Sá Vera 
> wrote:
>
> > That sounds great! I will check that and keep an eye on the long-running
> > server approach. Once it gets a ticket I could watch, just let me know,
> > please.
> >
> > Thanks
> >
> >
> > On Thu, 4 Jun 2020, 05:34 Vinoth Chandar wrote:
> >
> > > Hi Mario,
> > >
> > > We actually started with the idea of making the timeline server a
> > > long-running service. As you may have noticed, we have a module that
> > > builds a bundle that you could deploy. Maybe you can play with it and
> > > see if that sounds interesting to you. It will definitely have some
> > > rough edges, given it has not been widely used.
> > >
> > > Thanks
> > > Vinoth
> > >

Re: How to extend the timeline server schema to accommodate business metadata

2020-06-03 Thread Mario de Vera
Hi Vinoth, thanks for your comments on this. I spent some time thinking about
another possibility, which would be externalising the Hudi timeline service
itself to an external server holding both operational (i.e. Hudi) and
business metadata.

Would you guys have any opinion on that? Would that be easy? I do not
see a way to do it yet, other than what I have read about RocksDB, and that
is still not quite clear to me.

best regards,

Mario.

On Mon, Jun 1, 2020 at 16:01, Vinoth Chandar <
mail.vinoth.chan...@gmail.com> wrote:

> Hi Mario,
>
> Thanks for the detailed explanation. Hudi already allows extra metadata to
> be written atomically with each commit, i.e. write operation. In fact, that
> is how we track checkpoints for our delta streamer tool. It may not solve
> the need for querying the data together with this information, but it gives
> you the ability to do some basic tagging, if that's useful.
>
> >> If we enable the timeline service metadata model to be extended we could
> >> use the service instance itself to support specialised queries that
> >> involve business qualifiers in order to return a proper set of metadata
> >> pointing to the related commits
>
> This is a good idea actually. There is another active discuss thread on
> making the metadata queryable. There is also
> https://issues.apache.org/jira/browse/HUDI-309 which we have paused for now.
> But that's more in line with what you are thinking, IIUC.
>
>
> Thanks
> vinoth
>
> On Mon, Jun 1, 2020 at 4:41 AM Mario de Sá Vera 
> wrote:
>
> > Hi Balaji,
> >
> > Business metadata is any kind of information related to the business
> > where the Hudi solution is being used: from a COB (i.e. close of
> > business) date for a commit to any other qualifier that might be useful
> > to associate with that commit ID. If we enable the timeline service
> > metadata model to be extended, we could use the service instance itself
> > to support specialised queries that involve business qualifiers and
> > return the set of metadata pointing to the related commits that answer
> > a business query.
> >
> > If we do not have that flexibility, we might end up creating an external
> > transaction log, and then comes the hard task of keeping that service in
> > sync with the timeline service.
> >
> > let me know if that makes sense to you,
> >
> > Mario.
> >
> > On Mon, Jun 1, 2020 at 06:55, Balaji Varadarajan wrote:
> >
> > > Hi Mario,
> > > Timeline Server was designed to serve Hudi metadata for Hudi writers
> > > and readers. It may not be suitable to serve arbitrary data. But it is
> > > an interesting thought. Can you elaborate more on what kind of business
> > > metadata you are looking for? Is this something you are planning to
> > > store in commit files?
> > > Balaji.V
> > >
> > > On Sunday, May 31, 2020, 04:22:27 PM PDT, Mario de Sá Vera <
> > > desav...@gmail.com> wrote:
> > >
> > > I see a need for extending the current timeline server schema so that a
> > > flexible model could be achieved in order to accommodate business
> > > metadata.
> > >
> > > let me know if that makes sense to anyone here...
> > >
> > > Regards,
> > >
> > > Mario.
> > >
> >
>
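
Vinoth's remark above, that Hudi can write extra metadata atomically with each commit, could be used for basic business tagging along these lines. This is a minimal sketch, not a confirmed recipe from the thread: it assumes the Spark datasource option `hoodie.datasource.write.commitmeta.key.prefix` behaves as documented (option keys starting with the prefix are copied into the commit metadata), so please check it against your Hudi version. The helper name, the `_biz.` prefix and the `cob_date` key are hypothetical.

```python
from typing import Dict, Mapping

# Hudi datasource option naming the prefix of keys to copy into commit metadata.
COMMIT_META_PREFIX_KEY = "hoodie.datasource.write.commitmeta.key.prefix"


def with_commit_metadata(
    write_options: Mapping[str, str],
    business_metadata: Mapping[str, object],
    prefix: str = "_biz.",  # hypothetical prefix chosen for this example
) -> Dict[str, str]:
    """Return a copy of write_options that also carries business tags.

    Each tag is added as an option whose key starts with `prefix`, which is
    how the writer is told to store it alongside the commit.
    """
    opts = dict(write_options)  # do not mutate the caller's options
    opts[COMMIT_META_PREFIX_KEY] = prefix
    for key, value in business_metadata.items():
        opts[prefix + key] = str(value)
    return opts


if __name__ == "__main__":
    base = {"hoodie.table.name": "trades"}
    tagged = with_commit_metadata(base, {"cob_date": "2020-06-01"})
    for key in sorted(tagged):
        print(key, "=", tagged[key])
    # With Spark available, the tagged options would be passed to the writer:
    #   df.write.format("hudi").options(**tagged).mode("append").save(base_path)
```

As Vinoth notes, this only tags commits; it does not by itself make the tags queryable together with the data.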


Re: How to extend the timeline server schema to accommodate business metadata

2020-06-01 Thread Mario de Vera
Hi Balaji,

Business metadata is any kind of information related to the business where
the Hudi solution is being used: from a COB (i.e. close of business) date
for a commit to any other qualifier that might be useful to associate with
that commit ID. If we enable the timeline service metadata model to be
extended, we could use the service instance itself to support specialised
queries that involve business qualifiers and return the set of metadata
pointing to the related commits that answer a business query.

If we do not have that flexibility, we might end up creating an external
transaction log, and then comes the hard task of keeping that service in sync
with the timeline service.

let me know if that makes sense to you,

Mario.

On Mon, Jun 1, 2020 at 06:55, Balaji Varadarajan wrote:

> Hi Mario,
> Timeline Server was designed to serve Hudi metadata for Hudi writers and
> readers. It may not be suitable to serve arbitrary data. But it is an
> interesting thought. Can you elaborate more on what kind of business
> metadata you are looking for? Is this something you are planning to store in
> commit files?
> Balaji.V
>
> On Sunday, May 31, 2020, 04:22:27 PM PDT, Mario de Sá Vera <
> desav...@gmail.com> wrote:
>
>  I see a need for extending the current timeline server schema so that a
> flexible model could be achieved in order to accommodate business metadata.
>
> let me know if that makes sense to anyone here...
>
> Regards,
>
> Mario.
>
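
To make the idea in this message more concrete, here is a toy, in-memory sketch of a query that resolves business qualifiers to commit IDs. Nothing here is Hudi API: `CommitRecord`, `commits_matching` and the `cob_date` / `desk` qualifiers are hypothetical names used only to illustrate the kind of lookup an extended timeline service could answer.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Mapping


@dataclass(frozen=True)
class CommitRecord:
    """One commit on a timeline plus the business tags stored with it."""
    commit_time: str
    extra_metadata: Dict[str, str] = field(default_factory=dict)


def commits_matching(commits: List[CommitRecord], qualifiers: Mapping[str, str]) -> List[str]:
    """Return commit times whose metadata matches every given qualifier."""
    return [
        c.commit_time
        for c in commits
        if all(c.extra_metadata.get(k) == v for k, v in qualifiers.items())
    ]


if __name__ == "__main__":
    timeline = [
        CommitRecord("20200601101500", {"cob_date": "2020-05-29", "desk": "rates"}),
        CommitRecord("20200601120000", {"cob_date": "2020-06-01", "desk": "rates"}),
        CommitRecord("20200602090000", {"cob_date": "2020-06-01", "desk": "fx"}),
    ]
    # Which commits belong to the 2020-06-01 close of business?
    print(commits_matching(timeline, {"cob_date": "2020-06-01"}))
```

Keeping the qualifiers next to the commit records is what avoids the separate transaction log, and the synchronisation problem, described above.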


How to extend the timeline server schema to accommodate business metadata

2020-05-31 Thread Mario de Vera
I see a need for extending the current timeline server schema so that a 
flexible model could be achieved in order to accommodate business metadata.

let me know if that makes sense to anyone here...

Regards,

Mario.