Re: [VOTE] Release 0.6.0, release candidate #1

2020-08-22 Thread Bhavani Sudha
Thank you all. Closing the voting as we have got sufficient votes. Will send out tally in a separate email. On Sat, Aug 22, 2020 at 8:34 PM Shiyan Xu wrote: > Submitted the PR to update testing commands > https://github.com/apache/hudi/pull/2010 > > +1 (non-binding) > > - packaging ok > -

Re: [VOTE] Release 0.6.0, release candidate #1

2020-08-22 Thread Udit Mehrotra
+1 (non-binding) - Compiles successfully - Ran tests on EMR with a bunch of upsert/delete commits, and verified query results through spark datasource, spark-sql, hive and presto for COW/MOR tables - Ran insert/bulk insert/upserts on a 100GB tpcds table - Ran release validation scripts successfully

Re: [VOTE] Release 0.6.0, release candidate #1

2020-08-22 Thread Bhavani Sudha
+1 (binding) Downloaded tar and verified compile [OK] Run integration test locally. [OK] Run a few tests in IDE. [OK] Run quickstart [OK] Verify NOTICE and LICENSE exist [OK] Check Checksum [OK] Check no Binary files in source release [OK] Rat Check Passed [OK] On Sat, Aug 22, 2020 at

Re: [DISCUSS] Codestyle: force multiline indentation

2020-08-22 Thread Shiyan Xu
It can be up to the individual to use the IDE formatter or not, as long as there is a tool to help enforce Checkstyle rules. For people who use the IDE formatter, importing Checkstyle.xml as a format scheme does not fully control the formatter's behavior, which is why the IDE sometimes gets in the way. But

Re: [VOTE] Release 0.6.0, release candidate #1

2020-08-22 Thread Balaji Varadarajan
+1 (binding) 1. Ran long-running structured streaming writes on fake data and verified that compactions and ingestion are happening without errors. 2. Ran both scala and python based quickstart without any errors. There was an issue in the documented quickstart steps (not in hudi) for python example.

Re: [VOTE] Release 0.6.0, release candidate #1

2020-08-22 Thread Vinoth Chandar
+1 (binding) - Ran the RC checks I typically do - Ran a smoke test on both cow, mor tables - by running a lot of commits over a longer period of time, - verifying the state of the dataset - count validation match. On Sat, Aug 22, 2020 at 6:08 AM leesf wrote: > +1 (binding) > - mvn

Re: [DISCUSS] Support for `_hoodie_record_key` as a virtual column

2020-08-22 Thread Sivabalan
Aah, yes. That’s right. On Sat, Aug 22, 2020 at 2:43 AM Vinoth Chandar wrote: > All of the remaining meta fields compress very very nicely. They have > > almost no overhead. > > > > On Fri, Aug 21, 2020 at 12:00 PM Abhishek Modi > > wrote: > > > > > @sivabalan the current plan is to only add

Re: [VOTE] Release 0.6.0, release candidate #1

2020-08-22 Thread leesf
+1 (binding) - mvn clean package -DskipTests OK - ran quickstart guide OK (still get the exception ERROR view.PriorityBasedFileSystemView: Got error running preferred function. Trying secondary org.apache.hudi.exception.HoodieRemoteException: 192.168.1.102:56544 failed to respond at

Re: [DISCUSS] Codestyle: force multiline indentation

2020-08-22 Thread vino yang
Hi vc, Yes, different developers may have different preferences for this part of the practice. I have never enabled the IDE's automatic formatting, nor have I used the IDE's formatting functions manually. Because I have participated in multiple open source communities, each open source

Re: Incremental query on partition column

2020-08-22 Thread David Rosalia
Good morning Balaji, Vinoth, Thank you both for your replies. I agree that this is a topic that should come up more often and I am surprised that so little is said about this. The option B in your mail (writing the delete marker also in the historical records) sounds like a good option, but

Re: [VOTE] Release 0.6.0, release candidate #1

2020-08-22 Thread Gary Li
Thanks Raymond. The tests ran successfully with these commands. Best Regards, Gary Li On 8/21/20, 10:18 PM, "Shiyan Xu" wrote: I should have documented this...(which I will soon) When run from terminal, could you please try running with maven profile like `mvn -Punit-tests

Re: [DISCUSS] Support for `_hoodie_record_key` as a virtual column

2020-08-22 Thread Vinoth Chandar
All of the remaining meta fields compress very very nicely. They have almost no overhead. On Fri, Aug 21, 2020 at 12:00 PM Abhishek Modi wrote: > @sivabalan the current plan is to only add this for hoodie_record_key. But > I'm hoping to make the implementation general enough to add other

Re: Incremental query on partition column

2020-08-22 Thread Vinoth Chandar
Hi David, Thanks for the detailed email, and apologies for the sudden break in communication. >We wanted to manipulate the commit times to rebuild the history. Yes, best not to try and change the commit times/history. >- replay the data omitting the data of the persons who have requested to be

Re: [DISCUSS] Codestyle: force multiline indentation

2020-08-22 Thread Vinoth Chandar
>But, IMO, we can ignore the IDE here, if it breaks the code style, checkstyle will stop building and spotless will work. I differ here slightly. Most people reformat code using the "format code" in the IDE. And IDEs also can reorganize the code when you save etc. We need a solid way to not be