+1
On 2020/04/28 05:05:44, Vinoth Chandar wrote:
> Hello all,
>
> I would like to start a discussion on our readiness to pursue graduation to
> TLP and potentially follow up with a VOTE with a formal resolution. To seed
> the discussion, our community's achievements since entering the
xed!
>
> On Thu, Apr 23, 2020 at 11:57 AM lamberken wrote:
>
> > Hi Vinoth,
> >
> > The hudi.xml shown in the browser contains a syntax error.
> >
> > https://svn.apache.org/repos/asf/incubator/public/trunk/content/projects/hudi.xml
> >
> > Best,
> > Lamb
Hi Vinoth,
The hudi.xml shown in the browser contains a syntax error.
https://svn.apache.org/repos/asf/incubator/public/trunk/content/projects/hudi.xml
Best,
Lamber-Ken
On 2020/04/23 16:51:49, Vinoth Chandar wrote:
> Finally figured out.. :/ Updated the status file now, to reflect latest
> information
Hi team,
Currently, many users ask for support on slack when they hit bugs / problems,
but there are some disadvantages we need to consider:
1. code snippets are not displayed well.
2. we may miss some questions when several come up at the same time.
3. threads can't be indexed by
ntent" folder. Our plan
>> is to try this for 1 week and then actually cut over to "content" which is
>> what powers the site.
>>
>> Kudos to lamber-ken for the perseverance in getting this done!
>>
>> On Tue, Mar 24, 2020 at 5:19 PM lamberken wr
Hi team,
After HUDI-504[1] landed, travis builds the asf-site branch and updates the site
automatically.
Developers can focus on adding/editing/removing *.md files and don't need to learn
how to build the site.
Feel free to report any issues you see, thanks very much.
[1]
Tue, Mar 10, 2020 at 2:57 AM lamberken wrote:
>
>>
>>
>> hi,
>>
>>
>> IMO, when upserting 150K records with 100 columns, these records need to be
>> serialized to disk and deserialized from disk.
>> You can try adding < option("hoodie.memory.merge.max.size&
In addition, we have improved performance around DiskBasedMap & kryo on the master
branch.
You can also try building the hudi jar from the master branch.
best,
lamber-ken
At 2020-03-10 17:07:58, "selvaraj periyasamy"
wrote:
Sorry for the partial emails. My company portal doesn't allow me to add test code
hi,
IMO, when upserting 150K records with 100 columns, these records need to be serialized to
disk and deserialized from disk.
You can try adding < option("hoodie.memory.merge.max.size", "200485760") >
best,
lamber-ken
At 2020-03-10 17:07:58, "selvaraj periyasamy"
wrote:
Sorry for the
by a flag..
>
>Also let's separate the RDD vs DataFrame discussion out of this? Since that's
>orthogonal anyway..
>
>Thanks
>Vinoth
>
>
>On Fri, Feb 28, 2020 at 11:02 AM lamberken wrote:
>
>>
>>
>> Hi vinoth,
>>
>>
>> Thanks for revie
e.. i.e
>repartitionAndSortWithinPartitions() the input records to mergehandle, and
>if the file is also sorted on disk (it's not today), then we can do a
>merge_sort like algorithm to perform the merge.. We can probably write code
>to bear one time sorting costs... This will eliminate the nee
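The merge_sort-like idea quoted above can be sketched as follows. This is only an illustration under stated assumptions: the class name `SortedMergeSketch`, the `{key, value}` string-pair record shape, and the rule that the incoming record replaces the on-disk one are all hypothetical, and none of it is Hudi's merge handle.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Hudi code: merges two key-sorted runs in one pass,
// letting the incoming record win when both sides carry the same key.
final class SortedMergeSketch {

    // Each record is a {key, value} pair. Both lists must already be sorted
    // by key and contain each key at most once.
    static List<String[]> merge(List<String[]> onDisk, List<String[]> incoming) {
        List<String[]> merged = new ArrayList<>(onDisk.size() + incoming.size());
        int i = 0;
        int j = 0;
        while (i < onDisk.size() && j < incoming.size()) {
            int cmp = onDisk.get(i)[0].compareTo(incoming.get(j)[0]);
            if (cmp < 0) {
                merged.add(onDisk.get(i++));   // key exists only on disk: keep it
            } else if (cmp > 0) {
                merged.add(incoming.get(j++)); // key exists only in the input: insert
            } else {
                merged.add(incoming.get(j++)); // same key: the update replaces the old row
                i++;
            }
        }
        // Copy whatever remains on either side.
        while (i < onDisk.size()) {
            merged.add(onDisk.get(i++));
        }
        while (j < incoming.size()) {
            merged.add(incoming.get(j++));
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String[]> onDisk = List.of(new String[] {"a", "1"}, new String[] {"c", "3"});
        List<String[]> incoming = List.of(new String[] {"b", "2"}, new String[] {"c", "30"});
        for (String[] record : merge(onDisk, incoming)) {
            System.out.println(record[0] + "=" + record[1]);
        }
    }
}
```

Because both sides are read strictly in key order, the merge never needs to hold more than one record from each side at a time.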
e RDD based spark operations.
>
>Are you suggesting that we perform the merging in sql? Not following.
>Please clarify.
>
>On Wed, Feb 26, 2020 at 10:08 AM lamberken wrote:
>
>>
>>
>> Hi guys,
>>
>>
>> Motivation
>> Improve the merge performanc
Hi guys,
Motivation
Improve the merge performance for cow tables on upsert by handling the merge operation
with spark built-in operators.
Background
When doing an upsert operation, for each bucket, hudi needs to put the new input
elements into an in-memory cache map, and will
need an external map that spills
> Please go ahead and make the change @lamberken
>
>I was just looking at scripts from Hive and Kafka projects, see below.
>
>https://github.com/apache/hive/blob/master/bin/init-hive-dfs.sh
>https://github.com/apache/hive/blob/master/bin/hive-config.sh
>
>https://github.co
g in the hope that it works i.e
>spark does not change types etc often and spark-avro interplays.
>But let's have a flag in maven to skip this bundling if need be.. We should
>doc this clearly on the build instructions in the README?
>
>What do others think?
>
>
>
>On
eir own hudi jars,
>>> with the spark version they use. It avoids the compatibility issues
>>>
>>> between user's local jars and pre-built hudi spark version(2.4.4).
>>>
>>> Or can remove "org.apache.spark:spark-avro_2.11:2.4.4"? Because user
>>
Congratulations to Leesf, Vino Yang and Siva, +1 very well deserved :)
Best,
Lamber-Ken
At 2020-02-15 12:58:27, "vino yang" wrote:
>Thanks, folks. It's a great honor.
>
>Hudi community is great! Let us continue to make Hudi better.
>
>Best,
>Vino
>
>Noway <957029...@qq.com> wrote on Sat, Feb 15, 2020
Hi @bhasudha,
No need to say sorry, I think this discussion is meaningful for the hudi project.
Thanks,
Lamber-Ken
At 2020-02-06 07:07:49, "Bhavani Sudha" wrote:
>Hi @lamberken Sorry I missed to see this earlier. I also left this comment
>in the PR. I think Vinoth brings
Dear team,
With the 0.5.1 version released, users need to add
`org.apache.spark:spark-avro_2.11:2.4.4` when starting the hudi command, like below
Hi @Vinoth @Vino
IMO, we can use SonarQube[1] and Sonarlint[2] tools to help us detect and fix
quality issues.
Local env, follow below steps:
--
1, docker run -d --name sonarqube -p 9000:9000 sonarqube
Thanks @Vino :)
Thanks
Lamber-Ken
At 2020-01-24 10:18:28, "vino yang" wrote:
>Hi Lamber,
>
>+1 from my side.
>
>Best,
>Vino
>
>lamberken wrote on Fri, Jan 24, 2020 at 7:11 AM:
>
>>
>>
>> Thank you all. :)
>>
>>
>> Hi @nishit
--- Original --
>> > From:"Balaji Varadarajan"
>> > Date:Fri, Jan 24, 2020 01:21 AM
>> > To:"dev"
>> >
>> > Subject:Re: [DISCUSS] Redraw of hudi data lake architecture diagram
>> > on landing page
>>
Thanks @Balaji.V and @Vinoth.
At 2020-01-24 01:21:51, "Balaji Varadarajan" wrote:
> +1 as well. Looks great.
>Balaji.V
>On Thursday, January 23, 2020, 08:17:47 AM PST, Vinoth Chandar
> wrote:
>
> Looks good . +1 !
>
>On Wed, Jan 22,
Hello everyone,
I redrew the hudi data lake architecture diagram on the landing page. If you have
time, please take a look at the hudi website[1] and the test site[2].
Any thoughts are welcome, thanks very much. :)
[1] https://hudi.apache.org
[2] https://lamber-ken.github.io
Thanks
Lamber-Ken
at, Jan 18, 2020 at 2:10 PM Vinoth Chandar wrote:
>
>> Hello all,
>>
>> I am looking at doing this, so we can preserve the 0.5.0 release docs for
>> users who can't move to 0.5.1. Any suggestions? esp lamberken?
>>
>> I tried adding a subfolder under _docs in the hope that it will get picked
>> up and new html generated.. but does not seem to work.
>>
>> Thanks
>> Vinoth
>>
full version to introduce the design and
>architecture of HUDI has been written[1], and you are welcome to
>contribute.
>[JDBC Incremental Puller] A discussion about introducing a JDBC Delta
>Streamer to make HUDI more powerful[2] has been started, and an RFC[3] has
>been drafted for comments.
>
>
>On Tue, Jan 7, 2020 at 6:58 PM lamberken wrote:
>
>>
>>
>> Hi @Vinoth,
>>
>>
>> It's time to pick up this topic. Based on the content we talked about,
>> here are my thoughts
>>
>>
>> 1, Initial proposal aims to rework
er-ken Great to hear that you've led the comms with ApacheCN! Let me
>know if any help is needed. I'm also willing to help the translation work.
>
>On Wed, Jan 8, 2020 at 3:58 PM lamberken wrote:
>
>> Hello @Sudha,
>>
>>
>> You are welcome, no need to say sor
thread after
>vacation.
>@lamber-ken The new site looks cool. Thanks for the time and effort you
>have put into this.
>
>Thanks,
>Sudha
>
>
>
>On Tue, Jan 7, 2020 at 11:45 PM lamberken wrote:
>
>>
>>
>> Hi @Y Ethan Guo,
>>
>>
>> Than
n/writing_data.html). So if it's not hard to port
>them to the new website, they are still useful for the users.
>
>Best,
>- Ethan
>
>On Tue, Jan 7, 2020 at 11:05 PM lamberken wrote:
>
>>
>>
>> Hi @Y Ethan Guo,
>>
>>
>> Thank you very much for y
lls to the end of the page. IMHO, one point
>smaller might be better.
>
>- Ethan
>
>On Tue, Jan 7, 2020 at 3:11 PM lamberken wrote:
>
>>
>>
>> Hi Pratyaksh Sharma,
>>
>>
>> Good catch!
>>
>> Best,
>> Lamber-ken
>>
>&
-12-19 11:05:16, "Vinoth Chandar" wrote:
>Sounds good.. This scoped down version per se, does not need a RFC.
>
>On Wed, Dec 18, 2019 at 3:09 PM lamberken wrote:
>
>>
>>
>> Hi @Vinoth
>>
>>
>> I understand what you mean, I will cont
go, as with any code
>change
>
>Please share the PR here once you have it
>
>Thanks
>Vinoth
>
>On Fri, Dec 20, 2019 at 3:55 PM lamberken wrote:
>
>>
>>
>> Hi leesf,
>>
>>
>> Thank you for your affirmation.
>>
>>
>> b
s around jars, seem
>like regressions that the hudi-utilities is not a fat jar anymore?
>
>if there aren't any takers, I can also try my hand at fixing this, once I
>get done with few things on my end. left a comment on HUDI-485
>
>
>
>On Tue, Dec 31, 2019 at 4:19 PM
Hi @Pratyaksh Sharma,
Okay, all right. BTW, thanks for raising this issue.
best,
lamber-ken
On 01/2/2020 13:47,Pratyaksh Sharma wrote:
Hi Lamberken,
I am also trying to fix this issue. Please let us know if you come up with
anything.
On Thu, Jan 2, 2020 at 11:12 AM lamberken wrote
, Jan 1, 2020 at 8:57 PM lamberken wrote:
Hi @Vinoth,
I'm willing to solve this problem. I'm trying to find out from the history
when hudi-utilities-bundle stopped being a fat jar.
Git History
2019-08-29 FAT-JAR ---> 5f9fa82f47e1cc14a22b869250fe23c8f9c033cd
2019-09-14 NOT-FAT
ions that the hudi-utilities is not a fat jar anymore?
>
>if there aren't any takers, I can also try my hand at fixing this, once I
>get done with few things on my end. left a comment on HUDI-485
>
>
>
>On Tue, Dec 31, 2019 at 4:19 PM lamberken wrote:
>
>>
>>
>
, I have replaced it with hudi-spark-bundle-0.5.0-incubating.jar,
>> and the program seems to be stable.
>>
>> --
>> ma...@bonc.com.cn
>>
>>
>> *From:* lamberken
>> *Sent:* 2019-12-24 11:24
>> *To:* dev
>> *Subject:
this issue, could you try reproducing this on the
>>> docker setup
>>>
>>> https://hudi.apache.org/docker_demo.html#step-7--incremental-query-for-copy-on-write-table
>>> similar to this and raise a JIRA.
>>> Happy to look into it and get it fixed if neede
/f7834b3389e67b2b66b65386f59eb6646942206865133300c0416a6a%40%3Cdev.hudi.apache.org%3E
best,
lamber-ken
On 12/27/2019 21:02,Shahida Khan wrote:
@lamberken, when I have checked, folder .aux was empty ...
:(
On Fri, 27 Dec 2019 at 6:28 PM, lamberken wrote:
Hi @Shahida Khan,
I have a question: is the size of the *.clean.requested files 0
Hi @Shahida Khan,
I have a question: is the size of the *.clean.requested files 0?
best,
lamber-ken
On 12/27/2019 19:54,Shahida Khan wrote:
Hi,
Greetings!!
I am currently using Delta Streamer and upserting data via hudi in
real-time.
Have used the latest master branch.
Job was
without OOM?
ma...@bonc.com.cn
From: lamberken
Date: 2019-12-26 15:33
To: dev@hudi.apache.org
Subject: Re:insert too slow
Hi @mayu1,
Can you run the program below in the console? Looking forward to your feedback
Hi @mayu1,
Can you run the program below in the console? Looking forward to your feedback.
${SPARK_HOME}/bin/spark-shell \
--packages org.apache.hudi:hudi-spark-bundle:0.5.0-incubating \
--conf
Hi,
It's because there are spaces in the basePath, can you try to use this?
val basePath = "file:///tmp/hudi_cow_table"
best,
lamber-ken
On 12/25/2019 20:50,965147...@qq.com<965147...@qq.com> wrote:
hi,all
The environment I use is CDH6.3,
Use hadoop3 maven dependency to compile hudi,
on.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
at
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
BTW, I'm n
Hi @Vinoth
Okay, I will talk with @leesf about the checkstyle. In the end we will propose a
better solution.
best,
lamber-ken
At 2019-12-24 11:00:12, "Vinoth Chandar" wrote:
>Ironically, google style + checkstyle is what we had few months ago :)
>
>Can we have an owner to drive
Hi @Minh Pham
I agree with what @Y Ethan Guo says: for now we can disable the checkstyle rules that
cannot be fixed by automated reformatting.
If we replace all the new rules with google code style, it may take more time to
fix.
best,
lamber-ken
On 12/24/2019 03:34,Minh Pham wrote:
What do you guys
>> > > > After bringing spotless plugin back to project, it would
>> automatically
>> > > fix
>> > > > comment check error except for import order error, we need to fix
>> this
>> > > > error manually. In Apache Flink/Calcite, we also fix
that it
>does not apply all rules while formatting (e.g line breaks between import
>groups).
>
>On Sun, Dec 22, 2019 at 11:36 PM lamberken wrote:
>
>> Hi Vinoth,
>>
>>
>> Here are some of my points:
>>
>>
>> 1, When developers are no
Hi Vinoth,
Here are some of my points:
1, When developers are not familiar with checkstyle rules, they feel
uncomfortable. I think it's a good idea to
make the instructions on contributing guide work with the checkstyle rules we
already have.
2, We can also prompt users in the
Hi @Y Ethan Guo @Vinoth
I have some ideas for RFC-10 which aims to improve the Hudi web documentation
for
users and the process of updating docs for developers.
At the beginning, I tried to learn how to implement it from other projects, like
pulsar, druid etc.
After a period of research, I
issues there one by one.
I think a full line by line review is the best way to go, as with any code
change
Please share the PR here once you have it
Thanks
Vinoth
On Fri, Dec 20, 2019 at 3:55 PM lamberken wrote:
Hi leesf,
Thank you for your affirmation.
best,
lamber-ken
At 2019-12-21
please let lamberken/me know before using new website.
Best,
Leesf
[1] https://lamber-ken.github.io/
lamberken wrote on Fri, Dec 20, 2019 at 9:29 AM:
Hi nishith,
Thank you for your affirmation. The content in the blue box is to help us
understand the highlighted content.
It is different from the body cont
Hi everyone,
I have finished the rework of the new UI. If you have time, please visit the
website[1].
Any questions are welcome.
[1]https://lamber-ken.github.io/docs/quick-start-guide/
best,
lamber-ken
At 2019-12-19 07:38:47, "lamberken" wrote:
>
>
>Hi @Shiyan Xu
&g
Hi @Shiyan Xu
Thanks. :)
best,
lamber-ken
At 2019-12-19 00:53:51, "Shiyan Xu" wrote:
>Thank you @lamber-ken for the work! It is definitely a greater browsing
>experience.
>
>On Tue, Dec 17, 2019 at 8:28 PM lamberken wrote:
>
>>
>> Hi, @Vinoth
>&
;
>On Fri, Dec 13, 2019 at 5:18 PM lamberken wrote:
>
>>
>>
>> Hi, @vinoth
>>
>>
>> Okay, I see. If we don't want existing users to do any upgrading or
>> reconfigurations, then this refactor work will not make much sense.
>> This issue can be clos
>For content you read sequentially, it matters less. I agree..
>
>BTW the new site looks very sleek.. :)
>
>
>
>On Tue, Dec 17, 2019 at 4:50 PM lamberken wrote:
>
>>
>> hi, all
>> One more thing that is missing. In the new UI, I put a "BACK TO TOP"
>> butto
hi, all
One more thing that is missing. In the new UI, I put a "BACK TO TOP"
button at the bottom of all pages to help us get back to the top.
We can also discuss whether we need the right navigation at the community
meeting today.
best,
lamber-ken
At 2019-12-18 08:41:49, "la
( I'd rather change just the
>look-and-feel and evolve the content from there, per usual means)
>- Can we keep the theming blue and white, like now, since it gels well with
>the logo and images.
>
>
>On Mon, Dec 16, 2019 at 8:02 AM lamberken wrote:
>
>>
>>
>>
improve the site. Will review closely and
>> get
>> > back to you.
>> >
>> > On Sun, Dec 15, 2019 at 11:02 AM lamberken wrote:
>> >
>> > >
>> > >
>> > > Hello, everyone.
>> > >
>> > >
>> >
Hello, everyone.
Compared to the hudi project's website[3], the websites of Delta Lake[1] and Apache Iceberg[2] may look
better.
I delved into our web UI to try to improve it, and learned that it is
based on the jekyll-doc[4] theme,
which is not active. So we need to find a new active
inoth Chandar" wrote:
>Hi,
>
>Are you saying these classes need to change? If so, understood. But are
>you planning on renaming configs or relocating them? We don't want existing
>users to do any upgrading or reconfigurations
>
>On Fri, Dec 13, 2019 at 10:28 AM lamberken wrote
e) needs to change due to this.
>
>On Wed, Dec 11, 2019 at 7:18 PM lamberken wrote:
>
>>
>>
>> Hi, @vinoth
>>
>>
>> 1, Hoodie*Config classes are only used to set default value when call
>> their build method currently.
>> They will be replaced by Hood
just the changes pertaining to new check
>style rules across entire repo) whenever a new change is made to check
>style ? I rebased with latest and in order to get my build pass, I have
>already fixed like 20 files and the list keeps growing.
>
>
>
>
>
>
>
>
Hi, @Sivabalan
The new ImportOrder rule splits import statements into groups, and the groups are
separated by one blank line.
These groups are 1) org.apache.hudi 2) third party imports 3) javax 4)
java 5) static
For example
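The five groups above can be sketched as a small classifier. This is a minimal illustration, not the checkstyle implementation: the class `ImportGroups` is hypothetical, the sample import names are chosen only for demonstration, and alphabetical ordering inside a group is an assumption that the message does not state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hudi or checkstyle: it only illustrates the
// five groups listed in the message.
final class ImportGroups {

    // Returns the 1-based group index for a single import statement.
    static int groupOf(String importLine) {
        String s = importLine.trim();
        if (s.startsWith("import static ")) {
            return 5; // static imports come last
        }
        String name = s.substring("import ".length());
        if (name.startsWith("org.apache.hudi.")) {
            return 1;
        }
        if (name.startsWith("javax.")) {
            return 3;
        }
        if (name.startsWith("java.")) {
            return 4;
        }
        return 2; // everything else is treated as third party
    }

    // Orders imports by group (alphabetically inside a group, by assumption)
    // and puts one blank line between consecutive groups.
    static String format(List<String> imports) {
        List<String> sorted = new ArrayList<>(imports);
        sorted.sort((a, b) -> {
            int byGroup = Integer.compare(groupOf(a), groupOf(b));
            return byGroup != 0 ? byGroup : a.compareTo(b);
        });
        StringBuilder out = new StringBuilder();
        int previous = -1;
        for (String line : sorted) {
            int group = groupOf(line);
            if (previous != -1 && group != previous) {
                out.append('\n');
            }
            out.append(line).append('\n');
            previous = group;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(format(List.of(
                "import java.util.List;",
                "import org.apache.spark.api.java.JavaRDD;",
                "import static org.junit.Assert.assertTrue;",
                "import javax.annotation.Nullable;",
                "import org.apache.hudi.common.model.HoodieRecord;")));
    }
}
```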
most critical question from both me and balaji.
>
>On Wed, Dec 11, 2019 at 11:35 AM lamberken wrote:
>
>> hi, @Sivabalan
>>
>> Yes, thanks very much for helping me explain my initial proposal.
>>
>>
>> To answer your question, we can call HoodieWriteConfi
e changes that breaks existing behavior or
>introduces significantly complex new features.. If you are just planning to
>do the refactoring into ConfigOption class, per se you don't need a RFC.
>But , if you plan to address the fallback keys (or) your changes are going
>to break/change exi
Hi, all
Currently, many configuration items and their default values are dispersed across
config files like HoodieWriteConfig. This is very confusing for developers, and
it's easy for developers to use them in the wrong place, especially as there are
more and more configuration items. If we can
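One way to picture keeping a key and its default value together is the sketch below. It is an assumption-laden illustration only: the nested `ConfigOption` class shape, the `resolve` helper, and the key `example.write.parallelism` are hypothetical and are not taken from Hudi.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: neither this class shape nor the key below comes from Hudi.
final class ConfigSketch {

    // Bundles a configuration key with its default value in one object.
    static final class ConfigOption<T> {
        private final String key;
        private final T defaultValue;

        private ConfigOption(String key, T defaultValue) {
            this.key = key;
            this.defaultValue = defaultValue;
        }

        static <T> ConfigOption<T> of(String key, T defaultValue) {
            return new ConfigOption<>(key, defaultValue);
        }

        String key() {
            return key;
        }

        T defaultValue() {
            return defaultValue;
        }
    }

    // Key and default live side by side, so callers never repeat either one.
    static final ConfigOption<String> EXAMPLE_PARALLELISM =
            ConfigOption.of("example.write.parallelism", "100");

    // Looks the option up in user-supplied properties, falling back to its default.
    static String resolve(Map<String, String> props, ConfigOption<String> option) {
        return props.getOrDefault(option.key(), option.defaultValue());
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        System.out.println(resolve(props, EXAMPLE_PARALLELISM)); // default is used
        props.put("example.write.parallelism", "8");
        System.out.println(resolve(props, EXAMPLE_PARALLELISM)); // user value is used
    }
}
```

With this shape, a default is declared once next to its key, so it cannot drift between the places that read the setting.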
Sun, Dec 8, 2019 at 10:03 PM lamberken wrote:
In addition, we can use some tags to mark these issues, like "question", "bug",
"new feature". We can solve the bugs first.
Best,
lamber-ken
At 2019-12-09 13:43:38, "lamberken" wrote:
>
>
e the
>checkstyle rule level.
>
>WDYT? Is it what you prepare to do?
>
>Best,
>Vino
>
>
>lamberken wrote on Fri, Dec 6, 2019 at 2:39 PM:
>
>> Hi,
>>
>>
>> Currently, the level of the scala codestyle rules is warning; it would be better to check
>> these rules one by
Hi,
Currently, the level of the scala codestyle rules is warning; it would be better to check
these rules one by one
and then refactor the scala code.
Furthermore, in order to stay in sync with the java codestyle, we need to add two rules. One is
BlockImportChecker,
which ensures that only single imports are
from hudi-client dependency in hudi-utilities
>pom.
>
>In version 9.4.15, Container class is an interface while in lower version
>7.6.0, it is a class. This conflict of versions was causing mentioned
>exception for me.
>
>On Fri, Nov 29, 2019 at 5:37 PM Pratyaksh Sharma
Hi, Pratyaksh Sharma
To help us solve the problem, please provide your running environment.
For example, your operating system, your java version and so on. Thanks
At 2019-11-29 18:15:45, "Pratyaksh Sharma" wrote:
>Hi,
>
>Every time I try to run test cases of
Thanks,
I didn't know there was already a JIRA for this (HUDI-233). We can talk about it
under that issue.
Best,
lamberken
At 2019-11-26 03:09:40, "Vinoth Chandar" wrote:
>Hi,
>
>It's log4j actually across the board. (I think there are a couple files that
>have no
("The job needs to copy %d partitions.",
partitions.size()));
So, I suggest migrating from log4j to slf4j. What do you think?
Best,
lamberken
src/main/java
>> > | grep "public class" | wc -l
>> > 274
>> > 17:54:50 [incubator-hudi]$ grep -R -B 1 "public class"
>> hudi-*/src/main/java
>> > | grep "*/" | wc -l
>> > 178
>> > 17:55:06
+1, it's hard work but meaningful.
lamberken
lamber...@163.com
On 11/19/2019 07:27,leesf wrote:
Hi vino,
Thanks for bringing this discussion up.
+1 on all. the third one seems a bit too strict and usually requires manual
processing of the import order, but I