Re: Would not Stage source releases on dist.apache.org

2020-01-20 Thread Balaji Varadarajan
 Awesome. Let me know if you need anything else.
Balaji.V
On Monday, January 20, 2020, 11:32:08 PM PST, leesf wrote:

 Works after using
 svn checkout https://dist.apache.org/repos/dist/dev/incubator/hudi
 without --depth=immediates

On Tue, Jan 21, 2020 at 3:07 PM, leesf wrote:

> Hi Balaji,
>
> I could not find a way to create a folder under dev/incubator/hudi.
> Do I lack the permissions? Please advise. Thanks.
>
> On Tue, Jan 21, 2020 at 2:14 PM, Balaji Varadarajan wrote:
>
>>
>> Hi Leesf,
>> The staging directories are intentionally empty. The directories
>> corresponding to the 0.5.0-incubating release were deleted from the staging
>> directory as the last step of that release. You can create a folder
>> "0.5.1-incubating" under dev/incubator/hudi, add the source release
>> tarballs with their checksums there, and commit.
>> Thanks,
>> Balaji.V
>>
>> On Monday, January 20, 2020, 09:57:27 PM PST, leesf <leesf0...@gmail.com> wrote:
>>
>>  Hi all,
>>
>> I have completed the steps before step h (Stage source releases on
>> dist.apache.org) of the release guide [1], but I could not find
>> anything except KEYS in
>> https://dist.apache.org/repos/dist/dev/incubator/hudi/, so I could not
>> check it out with svn. Any suggestions?
>>
>> [1]
>>
>> https://cwiki.apache.org/confluence/display/HUDI/Apache+Hudi+%28incubating%29+-+Release+Guide
>>
>> Best,
>> Leesf
>>
>
>  

Re: Would not Stage source releases on dist.apache.org

2020-01-20 Thread leesf
Works after using
svn checkout https://dist.apache.org/repos/dist/dev/incubator/hudi
without --depth=immediates

On Tue, Jan 21, 2020 at 3:07 PM, leesf wrote:

> Hi Balaji,
>
> I could not find a way to create a folder under dev/incubator/hudi.
> Do I lack the permissions? Please advise. Thanks.
>
> On Tue, Jan 21, 2020 at 2:14 PM, Balaji Varadarajan wrote:
>
>>
>> Hi Leesf,
>> The staging directories are intentionally empty. The directories
>> corresponding to the 0.5.0-incubating release were deleted from the staging
>> directory as the last step of that release. You can create a folder
>> "0.5.1-incubating" under dev/incubator/hudi, add the source release
>> tarballs with their checksums there, and commit.
>> Thanks,
>> Balaji.V
>>
>> On Monday, January 20, 2020, 09:57:27 PM PST, leesf <leesf0...@gmail.com> wrote:
>>
>>  Hi all,
>>
>> I have completed the steps before step h (Stage source releases on
>> dist.apache.org) of the release guide [1], but I could not find
>> anything except KEYS in
>> https://dist.apache.org/repos/dist/dev/incubator/hudi/, so I could not
>> check it out with svn. Any suggestions?
>>
>> [1]
>>
>> https://cwiki.apache.org/confluence/display/HUDI/Apache+Hudi+%28incubating%29+-+Release+Guide
>>
>> Best,
>> Leesf
>>
>
>


Re: Would not Stage source releases on dist.apache.org

2020-01-20 Thread leesf
Hi Balaji,

I could not find a way to create a folder under dev/incubator/hudi.
Do I lack the permissions? Please advise. Thanks.

On Tue, Jan 21, 2020 at 2:14 PM, Balaji Varadarajan wrote:

>
> Hi Leesf,
> The staging directories are intentionally empty. The directories
> corresponding to the 0.5.0-incubating release were deleted from the staging
> directory as the last step of that release. You can create a folder
> "0.5.1-incubating" under dev/incubator/hudi, add the source release
> tarballs with their checksums there, and commit.
> Thanks,
> Balaji.V
>
> On Monday, January 20, 2020, 09:57:27 PM PST, leesf <leesf0...@gmail.com> wrote:
>
>  Hi all,
>
> I have completed the steps before step h (Stage source releases on
> dist.apache.org) of the release guide [1], but I could not find
> anything except KEYS in
> https://dist.apache.org/repos/dist/dev/incubator/hudi/, so I could not
> check it out with svn. Any suggestions?
>
> [1]
>
> https://cwiki.apache.org/confluence/display/HUDI/Apache+Hudi+%28incubating%29+-+Release+Guide
>
> Best,
> Leesf
>


Re: Would not Stage source releases on dist.apache.org

2020-01-20 Thread Balaji Varadarajan
 
Hi Leesf,
The staging directories are intentionally empty. The directories corresponding
to the 0.5.0-incubating release were deleted from the staging directory as the
last step of that release. You can create a folder "0.5.1-incubating" under
dev/incubator/hudi, add the source release tarballs with their checksums there,
and commit.
Thanks,
Balaji.V

On Monday, January 20, 2020, 09:57:27 PM PST, leesf wrote:
 
 Hi all,

I have completed the steps before step h (Stage source releases on
dist.apache.org) of the release guide [1], but I could not find
anything except KEYS in
https://dist.apache.org/repos/dist/dev/incubator/hudi/, so I could not
check it out with svn. Any suggestions?

[1]
https://cwiki.apache.org/confluence/display/HUDI/Apache+Hudi+%28incubating%29+-+Release+Guide

Best,
Leesf
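
For anyone following along, the staging steps described in this message
(create a version folder under dev/incubator/hudi, add the source release
tarballs with checksums, commit) can be sketched as a small shell script.
This is only a sketch under assumptions, not the official release procedure:
the local directory name hudi-dist-dev, the commit message, and the use of
sha512sum for the checksum are illustrative choices that do not come from
this thread, and the artifact paths are supplied by the caller. Signing the
artifacts is not covered.

```shell
#!/usr/bin/env bash
# Sketch only: stage source release artifacts in the dist.apache.org dev area.

DIST_DEV_URL="https://dist.apache.org/repos/dist/dev/incubator/hudi"

# Write <file>.sha512 next to the given file, in the format `sha512sum -c` accepts.
# Assumption: GNU coreutils sha512sum is available.
write_checksum() {
  local file="$1"
  local dir name
  dir="$(dirname "$file")"
  name="$(basename "$file")"
  ( cd "$dir" && sha512sum "$name" > "$name.sha512" )
}

# Check out the staging area, create the version folder, add artifacts, commit.
# Usage: stage_release <version> <artifact>...
stage_release() {
  local version="$1"; shift          # e.g. 0.5.1-incubating
  local workdir="hudi-dist-dev"      # hypothetical local directory name

  # Full checkout (no --depth=immediates), as reported to work in this thread.
  svn checkout "$DIST_DEV_URL" "$workdir" || return 1
  svn mkdir "$workdir/$version" || return 1

  local artifact
  for artifact in "$@"; do
    cp "$artifact" "$workdir/$version/" || return 1
    write_checksum "$workdir/$version/$(basename "$artifact")" || return 1
  done

  # Schedule the copied files and their checksum files, then commit everything.
  svn add "$workdir/$version"/* || return 1
  svn commit "$workdir" -m "Stage source release $version"
}
```

A call would look like `stage_release 0.5.1-incubating /path/to/source-release.tgz`,
where the tarball path is a placeholder. The functions are defined without
being invoked, so sourcing the script does not touch the repository.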
  

Would not Stage source releases on dist.apache.org

2020-01-20 Thread leesf
Hi all,

I have completed the steps before step h (Stage source releases on
dist.apache.org) of the release guide [1], but I could not find
anything except KEYS in
https://dist.apache.org/repos/dist/dev/incubator/hudi/, so I could not
check it out with svn. Any suggestions?

[1]
https://cwiki.apache.org/confluence/display/HUDI/Apache+Hudi+%28incubating%29+-+Release+Guide

Best,
Leesf


Re: about email

2020-01-20 Thread vino yang
Hi kuan,

If you want to unsubscribe from the dev mailing list, please refer to this
link [1].

Best,
Vino

[1]: http://hudi.apache.org/community.html

On Tue, Jan 21, 2020 at 10:25 AM, wangkuan <346795...@qq.com> wrote:

> Hi,
>
> I don’t want to receive the email from dev, thanks.


about email

2020-01-20 Thread wangkuan
Hi,

I don’t want to receive the email from dev, thanks.

Re: Don’t want to receive the email.

2020-01-20 Thread vino yang
Hi Qian,

If you want to unsubscribe from the dev mailing list, please refer to this
link [1].

Best,
Vino

[1]: http://hudi.apache.org/community.html

On Tue, Jan 21, 2020 at 9:36 AM, Qian Wang wrote:

> Hi,
>
> I don’t want to receive the email from dev, thanks.
>
> Best,
> Qian
>


Re: New blogs with 0.5.1 release

2020-01-20 Thread Vinoth Chandar
I would be supportive of that. I myself don't have cycles for this atm.

In fact, we planned to replace "Activities" with a "Blog" entry at the
top.
Not sure if there is an issue open. @lamber ken? (This came up during
the site review.)

On Mon, Jan 20, 2020 at 6:11 PM vino yang  wrote:

> Hi Vinoth,
>
> Thanks for sharing these features.
>
> Is there a plan to move some of the blogs hosted on cwiki to the official
> website, as Flink has done [1]?
>
> I have a strong motivation. The reason is that opening the cwiki site is
> *very* slow in China.
>
> Best,
> Vino
>
> [1]: https://flink.apache.org/blog/
>
> On Tue, Jan 21, 2020 at 7:01 AM, Vinoth Chandar wrote:
>
> > Hello all,
> >
> > There are a couple of blogs about new features in the 0.5.1 release.
> >
> > https://cwiki.apache.org/confluence/display/HUDI/2020/01/20/Change+Capture+Using+AWS+Database+Migration+Service+and+Hudi
> >
> > https://cwiki.apache.org/confluence/display/HUDI/2020/01/15/Delete+support+in+Hudi
> >
> > Hope you find them useful.
> >
> > Thanks
> > Vinoth
> >
>


Don’t want to receive the email.

2020-01-20 Thread Qian Wang
Hi,

I don’t want to receive the email from dev, thanks.

Best,
Qian


Re: HoodieDeltaStreamerException during upsert with DeltaStreamer.sync()

2020-01-20 Thread Vinoth Chandar
Hi Venki,

Thanks for reporting this. The latest commit file seems to be empty? I am
wondering if this is happening because there was no new data to process and
the tool wrote an empty commit file.
Can you confirm whether that matches your case?

Thanks
Vinoth


On Mon, Jan 20, 2020 at 4:00 PM Venki g  wrote:

> Correcting the link to commit file
>
> On Mon, Jan 20, 2020 at 3:50 PM Venki g  wrote:
>
> > Hi,
> >
> > I am using a Spark job to upsert the incremental delta files from S3 into
> > Hudi storage using the HoodieDeltaStreamer.sync() API. The incremental
> > Spark job is failing with the exception below:
> >
> > java.lang.RuntimeException: org.apache.hudi.utilities.exception.HoodieDeltaStreamerException: Unable to find previous checkpoint. Please double check if this table was indeed built via delta streamer
> >     at com.emr.java.HiveDeltaStreamer.loadData(HiveDeltaStreamer.java:36)
> >     at com.emr.java.HudiDataLoadJob.run(HudiDataLoadJob.java:28)
> >     at com.emr.java.HiveDeltaStreamer.main(HiveDeltaStreamer.java:19)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >     at java.lang.reflect.Method.invoke(Method.java:498)
> >     at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:684)
> > Caused by: org.apache.hudi.utilities.exception.HoodieDeltaStreamerException: Unable to find previous checkpoint. Please double check if this table was indeed built via delta streamer
> >     at org.apache.hudi.utilities.deltastreamer.DeltaSync.readFromSource(DeltaSync.java:252)
> >     at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:214)
> >     at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:120)
> >     at com.emr.java.HiveDeltaStreamer.loadData(HiveDeltaStreamer.java:30)
> >     ... 7 more
> >
> > I found that the most recent commit file does not contain
> > "deltastreamer.checkpoint.key". I checked the second-to-last
> > commit file and it has this key.
> >
> > Link to driver log (has the delta streamer config passed and other info) -
> > https://pastebin.pl/view/raw/9606beb0
> >
> > Link to most recent commit - https://pastebin.pl/view/raw/defc32ae
> >
> > When this happened for the first time, I was able to roll back the latest
> > commit, load the data again, and get past this exception. Since this
> > exception has started occurring again, I would like to understand the
> > issue here and find the fix, if any.
> >
> > Would highly appreciate any help on this.
> >
> > Thanks
> > Venkatesh
> >
>


Re: HoodieDeltaStreamerException during upsert with DeltaStreamer.sync()

2020-01-20 Thread Venki g
Correcting the link to commit file

On Mon, Jan 20, 2020 at 3:50 PM Venki g  wrote:

> Hi,
>
> I am using a Spark job to upsert the incremental delta files from S3 into
> Hudi storage using the HoodieDeltaStreamer.sync() API. The incremental
> Spark job is failing with the exception below:
>
> java.lang.RuntimeException: org.apache.hudi.utilities.exception.HoodieDeltaStreamerException: Unable to find previous checkpoint. Please double check if this table was indeed built via delta streamer
>     at com.emr.java.HiveDeltaStreamer.loadData(HiveDeltaStreamer.java:36)
>     at com.emr.java.HudiDataLoadJob.run(HudiDataLoadJob.java:28)
>     at com.emr.java.HiveDeltaStreamer.main(HiveDeltaStreamer.java:19)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:684)
> Caused by: org.apache.hudi.utilities.exception.HoodieDeltaStreamerException: Unable to find previous checkpoint. Please double check if this table was indeed built via delta streamer
>     at org.apache.hudi.utilities.deltastreamer.DeltaSync.readFromSource(DeltaSync.java:252)
>     at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:214)
>     at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:120)
>     at com.emr.java.HiveDeltaStreamer.loadData(HiveDeltaStreamer.java:30)
>     ... 7 more
>
> I found that the most recent commit file does not contain
> "deltastreamer.checkpoint.key". I checked the second-to-last
> commit file and it has this key.
>
> Link to driver log (has the delta streamer config passed and other info) -
> https://pastebin.pl/view/raw/9606beb0
>
> Link to most recent commit - https://pastebin.pl/view/raw/defc32ae
>
> When this happened for the first time, I was able to roll back the latest
> commit, load the data again, and get past this exception. Since this
> exception has started occurring again, I would like to understand the
> issue here and find the fix, if any.
>
> Would highly appreciate any help on this.
>
> Thanks
> Venkatesh
>


HoodieDeltaStreamerException during upsert with DeltaStreamer.sync()

2020-01-20 Thread Venki g
Hi,

I am using a Spark job to upsert the incremental delta files from S3 into
Hudi storage using the HoodieDeltaStreamer.sync() API. The incremental
Spark job is failing with the exception below:

java.lang.RuntimeException: org.apache.hudi.utilities.exception.HoodieDeltaStreamerException: Unable to find previous checkpoint. Please double check if this table was indeed built via delta streamer
    at com.emr.java.HiveDeltaStreamer.loadData(HiveDeltaStreamer.java:36)
    at com.emr.java.HudiDataLoadJob.run(HudiDataLoadJob.java:28)
    at com.emr.java.HiveDeltaStreamer.main(HiveDeltaStreamer.java:19)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:684)
Caused by: org.apache.hudi.utilities.exception.HoodieDeltaStreamerException: Unable to find previous checkpoint. Please double check if this table was indeed built via delta streamer
    at org.apache.hudi.utilities.deltastreamer.DeltaSync.readFromSource(DeltaSync.java:252)
    at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:214)
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:120)
    at com.emr.java.HiveDeltaStreamer.loadData(HiveDeltaStreamer.java:30)
    ... 7 more

I found that the most recent commit file does not contain
"deltastreamer.checkpoint.key". I checked the second-to-last
commit file and it has this key.

Link to driver log (has the delta streamer config passed and other info) -
https://pastebin.pl/view/raw/9606beb0

Link to most recent commit - https://pastebin.pl/view/raw/9606beb0

When this happened for the first time, I was able to roll back the latest
commit, load the data again, and get past this exception. Since this
exception has started occurring again, I would like to understand the
issue here and find the fix, if any.

Would highly appreciate any help on this.

Thanks
Venkatesh
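
One way to tell apart the two cases discussed in this thread (an empty
latest commit file versus a commit file that lacks the checkpoint key) is a
small shell helper. This is only a sketch under assumptions and not part of
Hudi: it assumes the commit file has been copied to the local filesystem and
is a text file in which the key appears as a quoted string, and the function
name classify_commit is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: classify a commit file by whether it carries the
# DeltaStreamer checkpoint key mentioned in this thread.

CHECKPOINT_KEY="deltastreamer.checkpoint.key"

# Prints one of: empty | has-checkpoint | no-checkpoint
# "empty" also covers a path that does not exist.
classify_commit() {
  local commit_file="$1"
  if [ ! -s "$commit_file" ]; then
    echo "empty"
  elif grep -qF "\"$CHECKPOINT_KEY\"" "$commit_file"; then
    echo "has-checkpoint"
  else
    echo "no-checkpoint"
  fi
}
```

Running `classify_commit` on the latest and the second-to-last commit files
side by side shows whether only the newest commit is missing the key, which
is the situation described above.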


New blogs with 0.5.1 release

2020-01-20 Thread Vinoth Chandar
Hello all,

There are a couple of blogs about new features in the 0.5.1 release.

https://cwiki.apache.org/confluence/display/HUDI/2020/01/20/Change+Capture+Using+AWS+Database+Migration+Service+and+Hudi

https://cwiki.apache.org/confluence/display/HUDI/2020/01/15/Delete+support+in+Hudi


Hope you find them useful.

Thanks
Vinoth


Re: [NOTIFICATION] Code is frozen for next release(0.5.1)

2020-01-20 Thread Pratyaksh Sharma
Awesome.

On Mon, Jan 20, 2020 at 3:49 PM leesf  wrote:

> Hi all,
>
> I hereby inform you that the code is now frozen as scheduled. All blockers
> have been resolved with the help of our community. Special thanks to @vinoth
> and @balaji for their effort to get all blocker fixes landed.
>
> 0.5.1-RC1 will be sent out in the next few days. Thank you.
>
> Best,
> Leesf
>


[NOTIFICATION] Code is frozen for next release(0.5.1)

2020-01-20 Thread leesf
Hi all,

I hereby inform you that the code is now frozen as scheduled. All blockers
have been resolved with the help of our community. Special thanks to @vinoth
and @balaji for their effort to get all blocker fixes landed.

0.5.1-RC1 will be sent out in the next few days. Thank you.

Best,
Leesf