Hey Vinoth, I noticed you added this suggestion to the weekly log, that is great! Just let me know if I am able to create a JIRA, as I tried to go to the HUDI project in Apache and did not find a way to do it. I can bring in a good description of the benefits etc.
Thanks,
Mario.

On Mon, Jun 8, 2020 at 12:46, Vinoth Chandar <vin...@apache.org> wrote:

> Hi,
>
> We can probably make a new JIRA. Not sure if there is an existing JIRA
> to re-use.
> The following modules are good to look at.
>
> hudi-timeline-service
> packaging/hudi-timeline-server-bundle
>
> Thanks
> Vinoth
>
> On Fri, Jun 5, 2020 at 12:56 AM Mario de Sá Vera <desav...@gmail.com>
> wrote:
>
> > Sorry Vinoth for not being clear... If that is a work in progress,
> > would you have a JIRA I could follow up and contribute to? If not,
> > what is the module name you suggest I look at?
> >
> > Regards,
> >
> > Mario.
> >
> > On Fri, 5 Jun 2020, 02:12 Vinoth Chandar, <vin...@apache.org> wrote:
> >
> > > Sorry, did not understand the last part. :) Are you suggesting we
> > > create a JIRA?
> > >
> > > On Thu, Jun 4, 2020 at 1:08 AM Mario de Sá Vera
> > > <desav...@gmail.com> wrote:
> > >
> > > > That sounds great! Will check that and keep an eye on the long
> > > > running server approach... once it gets a ticket I could watch,
> > > > just let me know please.
> > > >
> > > > Thanks
> > > >
> > > > On Thu, 4 Jun 2020, 05:34 Vinoth Chandar, <vin...@apache.org>
> > > > wrote:
> > > >
> > > > > Hi Mario,
> > > > >
> > > > > We actually started with the idea of making the timeline
> > > > > server a long running service. We have a module, if you
> > > > > notice, that builds out a bundle that you could deploy. Maybe
> > > > > you can play with it and see if that sounds interesting to
> > > > > you. It will definitely have some rough edges given it's not
> > > > > been widely used.
> > > > >
> > > > > Thanks
> > > > > Vinoth
> > > > >
> > > > > On Wed, Jun 3, 2020 at 2:33 AM Mario de Sá Vera
> > > > > <desav...@gmail.com> wrote:
> > > > >
> > > > > > Hi Vinoth, thanks for your comments on this. I spent some
> > > > > > time thinking over another possibility, which would be
> > > > > > externalising the Hudi timeline service itself to an
> > > > > > external server holding both operational (i.e. Hudi) and
> > > > > > business metadata.
> > > > > >
> > > > > > Would you guys have any opinion on that? Would that be
> > > > > > easy? I do not seem to see a way yet, except reading about
> > > > > > RocksDB, but that is still not quite clear.
> > > > > >
> > > > > > Best regards,
> > > > > >
> > > > > > Mario.
> > > > > >
> > > > > > On Mon, Jun 1, 2020 at 16:01, Vinoth Chandar
> > > > > > <mail.vinoth.chan...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Mario,
> > > > > > >
> > > > > > > Thanks for the detailed explanation. Hudi already allows
> > > > > > > extra metadata to be written atomically with each commit,
> > > > > > > i.e. write operation. In fact, that is how we track
> > > > > > > checkpoints for our delta streamer tool. It may not solve
> > > > > > > the need for querying the data together with this
> > > > > > > information, but it gives you the ability to do some
> > > > > > > basic tagging, if that's useful.
> > > > > > >
> > > > > > > >>If we enable the timeline service metadata model to be
> > > > > > > extended we could use the service instance itself to
> > > > > > > support specialised queries that involve business
> > > > > > > qualifiers in order to return a proper set of metadata
> > > > > > > pointing to the related commits
> > > > > > >
> > > > > > > This is a good idea actually. There is another active
> > > > > > > discuss thread on making the metadata queryable. There is
> > > > > > > also https://issues.apache.org/jira/browse/HUDI-309 which
> > > > > > > we paused for now. But that's more in line with what you
> > > > > > > are thinking, IIUC.
> > > > > > >
> > > > > > > Thanks
> > > > > > > vinoth
> > > > > > >
> > > > > > > On Mon, Jun 1, 2020 at 4:41 AM Mario de Sá Vera
> > > > > > > <desav...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi Balaji,
> > > > > > > >
> > > > > > > > Business metadata are all types of info related to the
> > > > > > > > business where the Hudi solution is being used, from a
> > > > > > > > COB (i.e. close of business date) related to that
> > > > > > > > commit to any qualifier that might be useful to
> > > > > > > > associate with that commit id. If we enable the
> > > > > > > > timeline service metadata model to be extended, we
> > > > > > > > could use the service instance itself to support
> > > > > > > > specialised queries that involve business qualifiers in
> > > > > > > > order to return a proper set of metadata pointing to
> > > > > > > > the related commits that answer a business query.
> > > > > > > >
> > > > > > > > If we do not have that flexibility, we might end up
> > > > > > > > creating an external transaction log, and then comes
> > > > > > > > the hard task of keeping that service in sync with the
> > > > > > > > timeline service.
> > > > > > > >
> > > > > > > > Let me know if that makes sense to you,
> > > > > > > >
> > > > > > > > Mario.
> > > > > > > >
> > > > > > > > On Mon, Jun 1, 2020 at 06:55, Balaji Varadarajan
> > > > > > > > <v.bal...@ymail.com.invalid> wrote:
> > > > > > > >
> > > > > > > > > Hi Mario,
> > > > > > > > > Timeline Server was designed to serve Hudi metadata
> > > > > > > > > for Hudi writers and readers. It may not be suitable
> > > > > > > > > to serve arbitrary data. But it is an interesting
> > > > > > > > > thought. Can you elaborate more on what kind of
> > > > > > > > > business metadata you are looking at? Is this
> > > > > > > > > something you are planning to store in commit files?
> > > > > > > > > Balaji.V
> > > > > > > > >
> > > > > > > > > On Sunday, May 31, 2020, 04:22:27 PM PDT, Mario de Sá
> > > > > > > > > Vera <desav...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > I see a need for extending the current timeline
> > > > > > > > > server schema so that a flexible model could be
> > > > > > > > > achieved in order to accommodate business metadata.
> > > > > > > > >
> > > > > > > > > Let me know if that makes sense to anyone here...
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > >
> > > > > > > > > Mario.
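
The idea discussed in this thread, attaching business qualifiers such as a COB date to each commit and then asking for the commits that match them, can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyTimeline`, `Commit`, `add_commit`, `find_commits`, and the `cob_date` and `desk` keys are hypothetical names made up for illustration. None of it is Hudi code or the timeline server's API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Commit:
    """One timeline entry: a commit id plus free-form extra metadata."""
    commit_id: str
    extra_metadata: Dict[str, str] = field(default_factory=dict)


class ToyTimeline:
    """Hypothetical in-memory stand-in for a timeline service (not Hudi code)."""

    def __init__(self) -> None:
        self._commits: List[Commit] = []

    def add_commit(self, commit_id: str, **extra_metadata: str) -> None:
        # The commit id and its metadata are stored as a single record,
        # mimicking metadata that is written together with the commit.
        self._commits.append(Commit(commit_id, dict(extra_metadata)))

    def find_commits(self, **qualifiers: str) -> List[str]:
        # Return the ids of commits whose metadata matches every qualifier.
        return [
            c.commit_id
            for c in self._commits
            if all(c.extra_metadata.get(k) == v for k, v in qualifiers.items())
        ]


timeline = ToyTimeline()
timeline.add_commit("20200601120000", cob_date="2020-05-29", desk="rates")
timeline.add_commit("20200602120000", cob_date="2020-06-01", desk="rates")
timeline.add_commit("20200602130000", cob_date="2020-06-01", desk="fx")

# Business query: which commits belong to a given close-of-business date?
print(timeline.find_commits(cob_date="2020-06-01"))
print(timeline.find_commits(cob_date="2020-06-01", desk="fx"))
```

Keeping the qualifiers in the same record as the commit id is what avoids the separate external transaction log, and the syncing problem, mentioned in the thread.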