Re: [ANNOUNCE] Hudi Community Weekly Update (2020-05-31 ~ 2020-06-07)

2020-06-08 Thread Yixue Zhu
On Sun, Jun 7, 2020, 8:23 AM leesf  wrote:

> Dear community,
>
> Happy to share the Hudi community weekly update for 2020-05-31 ~ 2020-06-07,
> with updates on discussions, bug fixes and tests.
>
> ===
> Discussion
>
> [Release] Hudi 0.5.3 release candidate #1 was cut, but cancelled because
> a fix still needs to be ported to 0.5.3. [1]
> [TLP] ASF announced that Hudi became a TLP. [2]
> [Hudi Usage] A discussion about extending the timeline server schema to
> accommodate business metadata. [3]
>
>
> ===
> Bugs
>
> [Hive Integration] Add processing logic for the decimal LogicalType. [4]
> [Common Core] Timeline API : filterCompletedAndCompactionInstants needs to
> handle requested state correctly [5]
>
>
> ===
> Tests
>
> [Test] https://jira.apache.org/jira/browse/HUDI-975 [6]
> [Test] Restructure test packages in hudi-client/cli [7]
> [Test] Fix unit test flakiness in Hudi [8]
>
>
>
> [1] https://lists.apache.org/thread.html/r17f8e26528fd06b9bbb0c0f21bb856e3632b926b8693c2d5322cad55%40%3Cdev.hudi.apache.org%3E
> [2] https://lists.apache.org/thread.html/r51b94588870072b878e05f74d4350db5b727f71d7263a7630c059962%40%3Cdev.hudi.apache.org%3E
> [3] https://lists.apache.org/thread.html/0af7abc978b8068fb38843c6e216c7a6e7b172f7a9c80c5bbaed9ca8%40%3Cdev.hudi.apache.org%3E
> [4] https://jira.apache.org/jira/browse/HUDI-934
> [5] https://jira.apache.org/jira/browse/HUDI-990
> [6] https://jira.apache.org/jira/browse/HUDI-975
> [7] https://jira.apache.org/jira/browse/HUDI-811
> [8] https://jira.apache.org/jira/browse/HUDI-988
>
>
> Best,
> Leesf
>


Re: How to extend the timeline server schema to accommodate business metadata

2020-06-08 Thread Vinoth Chandar
Hi,

We can probably make a new JIRA. I am not sure if there is an existing JIRA to
re-use.
The following modules are good to look at:

hudi-timeline-service
packaging/hudi-timeline-server-bundle

Thanks
Vinoth

On Fri, Jun 5, 2020 at 12:56 AM Mario de Sá Vera  wrote:

> Sorry Vinoth for not being clear... If that is a work in progress, would you
> have a JIRA I could follow and contribute to? If not, which module would
> you suggest I look at?
>
> Regards,
>
> Mario.
>
> On Fri, 5 Jun 2020, 02:12 Vinoth Chandar,  wrote:
>
> > Sorry, I did not understand the last part. :) Are you suggesting we create a
> > JIRA?
> >
> > On Thu, Jun 4, 2020 at 1:08 AM Mario de Sá Vera 
> > wrote:
> >
> > > That sounds great! I will check that and keep an eye on the long-running
> > > server approach... Once it gets a ticket I could watch, just let me know,
> > > please.
> > >
> > > Thanks
> > >
> > >
> > > On Thu, 4 Jun 2020, 05:34 Vinoth Chandar,  wrote:
> > >
> > > > Hi Mario,
> > > >
> > > > We actually started with the idea of making the timeline server a
> > > > long-running service. If you notice, we have a module that builds out a
> > > > bundle that you could deploy. Maybe you can play with it and see if that
> > > > sounds interesting to you. It will definitely have some rough edges, given
> > > > it's not been widely used.
> > > >
> > > > Thanks
> > > > Vinoth
> > > >
> > > > On Wed, Jun 3, 2020 at 2:33 AM Mario de Sá Vera 
> > > > wrote:
> > > >
> > > > > Hi Vinoth, thanks for your comments on this. I spent some time thinking
> > > > > over another possibility, which would be externalising the Hudi timeline
> > > > > service itself to an external server holding both operational (i.e. Hudi)
> > > > > and business metadata.
> > > > >
> > > > > Would you guys have any opinion on that? Would that be easy? I do not
> > > > > seem to see a way yet, except reading about RocksDB, but that is still
> > > > > not quite clear.
> > > > >
> > > > > best regards,
> > > > >
> > > > > Mario.
> > > > >
> > > > > On Mon, Jun 1, 2020 at 16:01, Vinoth Chandar
> > > > > <mail.vinoth.chan...@gmail.com> wrote:
> > > > >
> > > > > > Hi Mario,
> > > > > >
> > > > > > Thanks for the detailed explanation. Hudi already allows extra metadata
> > > > > > to be written atomically with each commit, i.e. write operation. In fact,
> > > > > > that is how we track checkpoints for our delta streamer tool. It may not
> > > > > > solve the need for querying the data together with this information, but
> > > > > > it gives you the ability to do some basic tagging, if that's useful.
> > > > > >
> > > > > > >>If we enable the timeline service metadata model to be extended we
> > > > > > could use the service instance itself to support specialised queries that
> > > > > > involve business qualifiers in order to return a proper set of metadata
> > > > > > pointing to the related commits
> > > > > >
> > > > > > This is a good idea actually. There is another active discuss thread on
> > > > > > making the metadata queryable. There is also
> > > > > > https://issues.apache.org/jira/browse/HUDI-309 which we paused for now,
> > > > > > but that's more in line with what you are thinking, IIUC.
> > > > > >
> > > > > >
> > > > > > Thanks
> > > > > > vinoth
> > > > > >
> > > > > > On Mon, Jun 1, 2020 at 4:41 AM Mario de Sá Vera <desav...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Balaji,
> > > > > > >
> > > > > > > Business metadata is any information related to the business where the
> > > > > > > Hudi solution is being used, from a COB (i.e. close of business) date
> > > > > > > related to that commit to any qualifier that might be useful to
> > > > > > > associate with that commit id. If we enable the timeline service
> > > > > > > metadata model to be extended, we could use the service instance itself
> > > > > > > to support specialised queries that involve business qualifiers, in
> > > > > > > order to return a proper set of metadata pointing to the related commits
> > > > > > > that answer a business query.
> > > > > > >
> > > > > > > If we do not have that flexibility, we might end up creating an external
> > > > > > > transaction log, and then comes the hard task of keeping that service in
> > > > > > > sync with the timeline service.
> > > > > > >
> > > > > > > let me know if that makes sense to you,
> > > > > > >
> > > > > > > Mario.
> > > > > > >
> > > > > > > On Mon, Jun 1, 2020 at 06:55, Balaji Varadarajan wrote:
> > > > > > >
> > > > > > > >  Hi Mario,
> > > > > > > > Timeline Server was designed to serve Hudi metadata for Hudi writers and
> > > > > > > > readers.  it 
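
The idea discussed in this thread (extra key/value metadata stored with each commit, then looked up by a business qualifier such as a COB date) can be sketched as below. This is only an illustrative model in plain Python, not the Hudi API: the Commit class, the commits_matching helper, the "cob_date" and "desk" keys, and the sample instant times are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Commit:
    """Hypothetical stand-in for one completed commit on a timeline."""
    instant_time: str  # commit id, here a timestamp-like string
    extra_metadata: Dict[str, str] = field(default_factory=dict)  # business tags


def commits_matching(timeline: List[Commit], key: str, value: str) -> List[str]:
    """Return the ids of commits whose extra metadata has key == value."""
    return [c.instant_time for c in timeline if c.extra_metadata.get(key) == value]


# A small in-memory timeline; the last commit carries no business metadata.
timeline = [
    Commit("20200529180000", {"cob_date": "2020-05-29"}),
    Commit("20200601064100", {"cob_date": "2020-06-01", "desk": "rates"}),
    Commit("20200601120000", {"cob_date": "2020-06-01"}),
    Commit("20200602090000"),
]

# Business query: which commits belong to the 2020-06-01 close of business?
matching = commits_matching(timeline, "cob_date", "2020-06-01")
print(matching)
```

Because the tags travel with the commit itself, no separate transaction log has to be kept in sync, which is the concern raised in the thread. A linear scan like this would presumably not scale to a long timeline, which is where a queryable metadata service would come in.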

Re: [DISSCUSS] Trigger a Travis-CI rebuild without pushing a commit

2020-06-08 Thread Vinoth Chandar
Thanks Lamber-ken!  Please give me some time to understand the PR and
review.

If anyone can jump on this in the meantime, please go ahead!

On Fri, Jun 5, 2020 at 3:33 PM Lamber Ken  wrote:

> Hi Vinoth,
>
> Based on the discussion above, I came up with an interesting idea:
> introduce a robot to build the testing website automatically. If folks don't
> want to build the staging site by themselves, we can introduce a robot which
> builds the testing website and pushes the site automatically.
>
> I have already raised a JIRA for this:
> https://github.com/apache/hudi/pull/1706
> https://issues.apache.org/jira/browse/HUDI-998
>
> Thanks,
> Lamber-Ken
>
> On 2020/06/01 15:14:32, Vinoth Chandar  wrote:
> > Great!  I left some comments on the PR around licensing and maintenance
> > overhead.
> >
> > On Sun, May 31, 2020 at 11:51 PM Lamber Ken wrote:
> >
> > > Hi folks,
> > >
> > > Having learned from the Travis and GitHub Actions API docs these days, I
> > > used my project as a demo [1]. The demo pull request will always fail;
> > > please use the "rerun tests" command, and it will rerun the tests
> > > automatically.
> > >
> > > If you are interested, try it.
> > >
> > > Best,
> > > Lamber-Ken
> > >
> > > [1] https://github.com/lamber-ken/hdocs/pull/36
> > >
> > >
> > > On 2020/05/27 06:08:05, Lamber Ken  wrote:
> > > > Dear community,
> > > >
> > > > Use case: A build fails due to an externality. The source is actually
> > > > correct. It would build OK and pass if simply re-run. Is there some way to
> > > > nudge Travis-CI to do another build, other than pushing a "dummy" commit?
> > > >
> > > > The way I often use is `git commit --allow-empty -m 'trigger rebuild'`:
> > > > push a dummy commit and Travis will rebuild. I also noticed that some Apache
> > > > projects support this feature.
> > > >
> > > > For example:
> > > > 1. Carbondata uses "retest this please"
> > > > https://github.com/apache/carbondata/pull/3387
> > > >
> > > > 2. Bookkeeper uses "run pr validation"
> > > > https://github.com/apache/bookkeeper/pull/2158
> > > >
> > > > But I can't find an effective solution in GitHub's and Travis's
> > > > documentation [1]. Any thoughts or opinions?
> > > >
> > > > Best,
> > > > Lamber-Ken
> > > >
> > > > [1] https://docs.travis-ci.com https://support.github.com
> > > >
> > >
> >
>
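
A comment-triggered rebuild of the kind discussed in this thread could be sketched roughly as follows. This is a minimal sketch and not the implementation in the demo or in the PR linked above, whose internals are not shown here. It assumes the Travis CI API v3 build restart endpoint (POST /build/{build.id}/restart) and the api.travis-ci.org host; the function names, the build id and the token are hypothetical. The request is only described as a dictionary, not sent.

```python
from typing import Dict, Optional

# Trigger phrase taken from the demo described in the thread.
TRIGGER_PHRASE = "rerun tests"


def is_rerun_command(comment_body: str) -> bool:
    """Return True when a PR comment is exactly the trigger phrase
    (ignoring case and surrounding whitespace)."""
    return comment_body.strip().lower() == TRIGGER_PHRASE


def build_restart_request(build_id: int, token: str,
                          api_base: str = "https://api.travis-ci.org") -> Dict[str, object]:
    """Describe the HTTP call that would restart a Travis build.

    Assumption: Travis API v3, token-based authorization. Nothing is sent.
    """
    return {
        "method": "POST",
        "url": f"{api_base}/build/{build_id}/restart",
        "headers": {
            "Travis-API-Version": "3",
            "Authorization": f"token {token}",
        },
    }


def handle_comment(comment_body: str, build_id: int,
                   token: str) -> Optional[Dict[str, object]]:
    """Return a restart request for trigger comments, otherwise None."""
    if not is_rerun_command(comment_body):
        return None
    return build_restart_request(build_id, token)


# Hypothetical usage: build id 42 and a placeholder token.
request = handle_comment("rerun tests", 42, "PLACEHOLDER_TOKEN")
print(request["method"], request["url"])
```

Compared with pushing an empty commit, this approach leaves the branch history untouched, at the cost of running and maintaining a bot that holds an API token.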


Re: Apply for Confluence

2020-06-08 Thread Vinoth Chandar
Done. Welcome aboard!

On Mon, Jun 8, 2020 at 1:00 AM 李 天烨  wrote:

> Hi,
>
> I want to contribute to Apache Hudi.
> Would you please give me the contributor permission?
> My Confluence ID is litianye, and my email is litiany...@outlook.com.
>


Apply for Confluence

2020-06-08 Thread 李 天烨
Hi,

I want to contribute to Apache Hudi.
Would you please give me the contributor permission?
My Confluence ID is litianye, and my email is litiany...@outlook.com.


Apply for JIRA

2020-06-08 Thread 李 天烨
Hi,

I want to contribute to Apache Hudi.
Would you please give me the contributor permission?
My JIRA ID is Litianye.