[GitHub] [hudi] codejoyan edited a comment on issue #2620: [SUPPORT] Performance Tuning: Slow stages (Building Workload Profile & Getting Small files from partitions) during Hudi Writes

2021-07-25 Thread GitBox
codejoyan edited a comment on issue #2620: URL: https://github.com/apache/hudi/issues/2620#issuecomment-886077052 Some additional details for the above runs. 1. The configs I am using - REGULAR BLOOM. 2. Max and Min file size in older partitions - 116 MB and 6 MB respectively 3.

[GitHub] [hudi] fanaticjo commented on a change in pull request #3035: [HUDI-1936] Introduce a optional property for conditional upsert

2021-07-25 Thread GitBox
fanaticjo commented on a change in pull request #3035: URL: https://github.com/apache/hudi/pull/3035#discussion_r676157680 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/OverwriteWithCustomAvroPayload.java ## @@ -0,0 +1,107 @@ +/* + * Licensed to the

[GitHub] [hudi] hudi-bot edited a comment on pull request #3289: [HUDI-2187] Add a shim layer to support multiple hive version

2021-07-25 Thread GitBox
hudi-bot edited a comment on pull request #3289: URL: https://github.com/apache/hudi/pull/3289#issuecomment-881900670 ## CI report: * 83b766a3dcb36a3d495b2bf54400a9980086add4 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3289: [HUDI-2187] Add a shim layer to support multiple hive version

2021-07-25 Thread GitBox
hudi-bot edited a comment on pull request #3289: URL: https://github.com/apache/hudi/pull/3289#issuecomment-881900670 ## CI report: * 102f31feef4532b98343f06cf4684f2c513b684e Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3289: [HUDI-2187] Add a shim layer to support multiple hive version

2021-07-25 Thread GitBox
hudi-bot edited a comment on pull request #3289: URL: https://github.com/apache/hudi/pull/3289#issuecomment-881900670 ## CI report: * 102f31feef4532b98343f06cf4684f2c513b684e Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3340: [HUDI-2217] Fix no value present in incremental query on MOR

2021-07-25 Thread GitBox
hudi-bot edited a comment on pull request #3340: URL: https://github.com/apache/hudi/pull/3340#issuecomment-886153056 ## CI report: * 6ec2dbd0c8955eb2e2f18e0744a33b341b110795 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3340: [HUDI-2217] Fix no value present in incremental query on MOR

2021-07-25 Thread GitBox
hudi-bot edited a comment on pull request #3340: URL: https://github.com/apache/hudi/pull/3340#issuecomment-886153056 ## CI report: * 14ff661ef2404dedf27c9c541a51fff35b07fafd Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3340: [HUDI-2217] Fix no value present in incremental query on MOR

2021-07-25 Thread GitBox
hudi-bot edited a comment on pull request #3340: URL: https://github.com/apache/hudi/pull/3340#issuecomment-886153056 ## CI report: * 14ff661ef2404dedf27c9c541a51fff35b07fafd Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3341: [HUDI-2218] Fix missing HoodieWriteStat in HoodieCreateHandle

2021-07-25 Thread GitBox
hudi-bot edited a comment on pull request #3341: URL: https://github.com/apache/hudi/pull/3341#issuecomment-886156722 ## CI report: * d3d42fd04828b0c73fae5a497b35aa3d1629388f Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3341: [HUDI-2218] Fix missing HoodieWriteStat in HoodieCreateHandle

2021-07-25 Thread GitBox
hudi-bot edited a comment on pull request #3341: URL: https://github.com/apache/hudi/pull/3341#issuecomment-886156722 ## CI report: * d3d42fd04828b0c73fae5a497b35aa3d1629388f Azure:

[GitHub] [hudi] hudi-bot commented on pull request #3341: [HUDI-2218] Fix missing HoodieWriteStat in HoodieCreateHandle

2021-07-25 Thread GitBox
hudi-bot commented on pull request #3341: URL: https://github.com/apache/hudi/pull/3341#issuecomment-886156722 ## CI report: * d3d42fd04828b0c73fae5a497b35aa3d1629388f UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis`

[GitHub] [hudi] garyli1019 opened a new pull request #3341: [HUDI-2218] Fix missing HoodieWriteStat in HoodieCreateHandle

2021-07-25 Thread GitBox
garyli1019 opened a new pull request #3341: URL: https://github.com/apache/hudi/pull/3341 ## What is the purpose of the pull request Some HoodieWriteStat values computed during the runtime were lost (e.g. computing the min event time, which is stored in the HoodieWriteStat, in a customized

[GitHub] [hudi] hudi-bot edited a comment on pull request #3340: [HUDI-2217] Fix no value present in incremental query on MOR

2021-07-25 Thread GitBox
hudi-bot edited a comment on pull request #3340: URL: https://github.com/apache/hudi/pull/3340#issuecomment-886153056 ## CI report: * 14ff661ef2404dedf27c9c541a51fff35b07fafd Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3340: [HUDI-2217] Fix no value present in incremental query on MOR

2021-07-25 Thread GitBox
hudi-bot edited a comment on pull request #3340: URL: https://github.com/apache/hudi/pull/3340#issuecomment-886153056 ## CI report: * 14ff661ef2404dedf27c9c541a51fff35b07fafd Azure:

[GitHub] [hudi] hudi-bot commented on pull request #3340: [HUDI-2217] Fix no value present in incremental query on MOR

2021-07-25 Thread GitBox
hudi-bot commented on pull request #3340: URL: https://github.com/apache/hudi/pull/3340#issuecomment-886153056 ## CI report: * 14ff661ef2404dedf27c9c541a51fff35b07fafd UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis`

[GitHub] [hudi] garyli1019 opened a new pull request #3340: [HUDI-2217] Fix no value present in incremental query on MOR

2021-07-25 Thread GitBox
garyli1019 opened a new pull request #3340: URL: https://github.com/apache/hudi/pull/3340 ## What is the purpose of the pull request Fix no value present in incremental query on MOR ## Brief change log - *handle no instant being pulled from incremental query* ##

[GitHub] [hudi] yanghua merged pull request #3339: [HUDI-2216] Correct the words fiels in the comments to fields

2021-07-24 Thread GitBox
yanghua merged pull request #3339: URL: https://github.com/apache/hudi/pull/3339 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [hudi] codejoyan commented on issue #2620: [SUPPORT] Performance Tuning: Slow stages (Building Workload Profile & Getting Small files from partitions) during Hudi Writes

2021-07-24 Thread GitBox
codejoyan commented on issue #2620: URL: https://github.com/apache/hudi/issues/2620#issuecomment-886077052 Some additional details for the above runs. 1. The configs I am using - REGULAR BLOOM. 2. Max and Min file size in older partitions - 116 MB and 6 MB respectively 3. Avg

[GitHub] [hudi] hudi-bot edited a comment on pull request #3328: [HUDI-2208] Support Bulk Insert For Spark Sql

2021-07-24 Thread GitBox
hudi-bot edited a comment on pull request #3328: URL: https://github.com/apache/hudi/pull/3328#issuecomment-884869427 ## CI report: * 9c9f804618dd0275abdae10673c21bf1f5737caf UNKNOWN * 50539ec543951e7a4442798ac7c66e5dc3d3705a UNKNOWN *

[GitHub] [hudi] MikelDelTio edited a comment on issue #2688: [SUPPORT] Sync to Hive using Metastore

2021-07-24 Thread GitBox
MikelDelTio edited a comment on issue #2688: URL: https://github.com/apache/hudi/issues/2688#issuecomment-881451261 Same problem here using emr-6.3.0 and hudi 0.7.0-amazn-0 Spark Session: ``` spark = ( SparkSession.builder.appName(spark_application_name)

[GitHub] [hudi] hudi-bot edited a comment on pull request #3339: [HUDI-2216]Correct the words 'fiels' in the comments to 'fields'

2021-07-24 Thread GitBox
hudi-bot edited a comment on pull request #3339: URL: https://github.com/apache/hudi/pull/3339#issuecomment-886013297 ## CI report: * 0b6ea7b45b975e1da3e1cf93abe527cb9d5b Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3328: [HUDI-2208] Support Bulk Insert For Spark Sql

2021-07-24 Thread GitBox
hudi-bot edited a comment on pull request #3328: URL: https://github.com/apache/hudi/pull/3328#issuecomment-884869427 ## CI report: * 9c9f804618dd0275abdae10673c21bf1f5737caf UNKNOWN * 50539ec543951e7a4442798ac7c66e5dc3d3705a UNKNOWN *

[GitHub] [hudi] hudi-bot edited a comment on pull request #3339: [HUDI-2216]Correct the words 'fiels' in the comments to 'fields'

2021-07-24 Thread GitBox
hudi-bot edited a comment on pull request #3339: URL: https://github.com/apache/hudi/pull/3339#issuecomment-886013297 ## CI report: * 0b6ea7b45b975e1da3e1cf93abe527cb9d5b Azure:

[GitHub] [hudi] hudi-bot commented on pull request #3339: [HUDI-2216]Correct the words 'fiels' in the comments to 'fields'

2021-07-24 Thread GitBox
hudi-bot commented on pull request #3339: URL: https://github.com/apache/hudi/pull/3339#issuecomment-886013297 ## CI report: * 0b6ea7b45b975e1da3e1cf93abe527cb9d5b UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis`

[GitHub] [hudi] dongkelun opened a new pull request #3339: [HUDI-2216]Correct the words 'fiels' in the comments to 'fields'

2021-07-24 Thread GitBox
dongkelun opened a new pull request #3339: URL: https://github.com/apache/hudi/pull/3339 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[GitHub] [hudi] nmukerje edited a comment on issue #3321: [SUPPORT] Setting _hoodie_is_deleted column is not deleting records when using Spark DataSource.

2021-07-24 Thread GitBox
nmukerje edited a comment on issue #3321: URL: https://github.com/apache/hudi/issues/3321#issuecomment-886012008 I am not using bulk insert for the upsert/delete. I am just using bulk insert for Step 1 to stage some records. The notebook

[GitHub] [hudi] nmukerje edited a comment on issue #3321: [SUPPORT] Setting _hoodie_is_deleted column is not deleting records when using Spark DataSource.

2021-07-24 Thread GitBox
nmukerje edited a comment on issue #3321: URL: https://github.com/apache/hudi/issues/3321#issuecomment-886012008 I am not using bulk insert for the upsert/delete. I am just using bulk insert for Step 1 to stage some records. The notebook

[GitHub] [hudi] nmukerje commented on issue #3321: [SUPPORT] Setting _hoodie_is_deleted column is not deleting records when using Spark DataSource.

2021-07-24 Thread GitBox
nmukerje commented on issue #3321: URL: https://github.com/apache/hudi/issues/3321#issuecomment-886012008 I am not using bulk insert for the upsert/delete. I am just using bulk insert for Step 1 to stage some records. The notebook is public so you can run the cells/code. The schema is

[GitHub] [hudi] hudi-bot edited a comment on pull request #3328: [HUDI-2208] Support Bulk Insert For Spark Sql

2021-07-24 Thread GitBox
hudi-bot edited a comment on pull request #3328: URL: https://github.com/apache/hudi/pull/3328#issuecomment-884869427 ## CI report: * 9c9f804618dd0275abdae10673c21bf1f5737caf UNKNOWN * 50539ec543951e7a4442798ac7c66e5dc3d3705a UNKNOWN *

[GitHub] [hudi] pengzhiwei2018 commented on a change in pull request #3328: [HUDI-2208] Support Bulk Insert For Spark Sql

2021-07-24 Thread GitBox
pengzhiwei2018 commented on a change in pull request #3328: URL: https://github.com/apache/hudi/pull/3328#discussion_r675960618 ## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestInsertTable.scala ## @@ -144,7 +145,7 @@ class

[GitHub] [hudi] mkk1490 commented on issue #3313: [SUPPORT] CoW: Hudi Upsert not working when there is a timestamp field in the composite key

2021-07-24 Thread GitBox
mkk1490 commented on issue #3313: URL: https://github.com/apache/hudi/issues/3313#issuecomment-886010620 @nsivabalan I'm so sorry. That's my mistake. I'm trying to update the next field to src_pri_psbr_id which is pri_az_cust_id. Please find the dfs below: Insert: df_ins =

[GitHub] [hudi] danny0405 closed pull request #3334: [HUDI-2209] Bulk insert for flink writer

2021-07-24 Thread GitBox
danny0405 closed pull request #3334: URL: https://github.com/apache/hudi/pull/3334

[GitHub] [hudi] nsivabalan commented on issue #3313: [SUPPORT] CoW: Hudi Upsert not working when there is a timestamp field in the composite key

2021-07-24 Thread GitBox
nsivabalan commented on issue #3313: URL: https://github.com/apache/hudi/issues/3313#issuecomment-886006267 Can you confirm something. I see that you have "src_pri_psbr_id" as part of your record key fields. So, I was expecting to update a record, you will not touch any of the fields

[GitHub] [hudi] hudi-bot edited a comment on pull request #3338: [HUDI-2215] Add rateLimiter when Flink writes to hudi.

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3338: URL: https://github.com/apache/hudi/pull/3338#issuecomment-88662 ## CI report: * 44335b743bc600ca82282b2afeb208017e151c98 Azure:

[GitHub] [hudi] nsivabalan commented on issue #3280: [SUPPORT] Use structedstreaming to consume kafka to write to hudi error

2021-07-23 Thread GitBox
nsivabalan commented on issue #3280: URL: https://github.com/apache/hudi/issues/3280#issuecomment-886004864 I could see two exceptions from your stacktrace: 1. Caused by: java.lang.ClassNotFoundException: org.apache.hudi.DefaultSource 2. User class threw exception:

[GitHub] [hudi] nsivabalan edited a comment on issue #3336: [SUPPORT]

2021-07-23 Thread GitBox
nsivabalan edited a comment on issue #3336: URL: https://github.com/apache/hudi/issues/3336#issuecomment-886003132 Also, have you tried doing an update ("upsert") operation? Once that succeeds, then we know delete has some issue. but if update is failing, then could be some config issue

[GitHub] [hudi] nsivabalan edited a comment on issue #3336: [SUPPORT]

2021-07-23 Thread GitBox
nsivabalan edited a comment on issue #3336: URL: https://github.com/apache/hudi/issues/3336#issuecomment-886003132 Also, have you tried doing an "update" operation? Once that succeeds, then we know delete has some issue. but if update is failing, then could be some config issue or we

[GitHub] [hudi] nsivabalan commented on issue #3336: [SUPPORT]

2021-07-23 Thread GitBox
nsivabalan commented on issue #3336: URL: https://github.com/apache/hudi/issues/3336#issuecomment-886003132 Also, have you tried doing an update in general. Once that succeeds, then we know delete has some issue. but if update is failing, then could be some config issue or we might have

[GitHub] [hudi] nsivabalan commented on issue #3336: [SUPPORT]

2021-07-23 Thread GitBox
nsivabalan commented on issue #3336: URL: https://github.com/apache/hudi/issues/3336#issuecomment-886003055 Have you configured your record keys and partition path correctly? I see from your configs, you have given "two" for all 3 configs (record keys, partition path and preCombine).

[GitHub] [hudi] nsivabalan commented on issue #3337: [SUPPORT] Trouble getting yyyy/mm partitioning to work with Hive sync

2021-07-23 Thread GitBox
nsivabalan commented on issue #3337: URL: https://github.com/apache/hudi/issues/3337#issuecomment-886002077 If vinoth's suggestion does not work, do let us know. We might have to investigate further.

[GitHub] [hudi] hudi-bot edited a comment on pull request #3338: [HUDI-2215] Add rateLimiter when Flink writes to hudi.

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3338: URL: https://github.com/apache/hudi/pull/3338#issuecomment-88662 ## CI report: * 44335b743bc600ca82282b2afeb208017e151c98 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #3338: [HUDI-2215] Add rateLimiter when Flink writes to hudi.

2021-07-23 Thread GitBox
hudi-bot commented on pull request #3338: URL: https://github.com/apache/hudi/pull/3338#issuecomment-88662 ## CI report: * 44335b743bc600ca82282b2afeb208017e151c98 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis`

[GitHub] [hudi] mincwang opened a new pull request #3338: [HUDI-2215] Add rateLimiter when Flink writes to hudi.

2021-07-23 Thread GitBox
mincwang opened a new pull request #3338: URL: https://github.com/apache/hudi/pull/3338 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[GitHub] [hudi] vinothchandar merged pull request #3302: [HUDI-1241] Automate the generation of configs webpage as configs are added to Hudi repo

2021-07-23 Thread GitBox
vinothchandar merged pull request #3302: URL: https://github.com/apache/hudi/pull/3302

[GitHub] [hudi] danny0405 closed pull request #3334: [HUDI-2209] Bulk insert for flink writer

2021-07-23 Thread GitBox
danny0405 closed pull request #3334: URL: https://github.com/apache/hudi/pull/3334

[GitHub] [hudi] leesf commented on a change in pull request #3330: [HUDI-2101][RFC-28]support z-order for hudi

2021-07-23 Thread GitBox
leesf commented on a change in pull request #3330: URL: https://github.com/apache/hudi/pull/3330#discussion_r675936746 ## File path: hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/Zoptimize.scala ## @@ -0,0 +1,750 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] leesf commented on a change in pull request #3330: [HUDI-2101][RFC-28]support z-order for hudi

2021-07-23 Thread GitBox
leesf commented on a change in pull request #3330: URL: https://github.com/apache/hudi/pull/3330#discussion_r675936657 ## File path: hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/Zoptimize.scala ## @@ -0,0 +1,750 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] codope commented on pull request #3285: [HUDI-1771] Propagate CDC format for hoodie

2021-07-23 Thread GitBox
codope commented on pull request #3285: URL: https://github.com/apache/hudi/pull/3285#issuecomment-885989683 > @codope please help me close out the schema evolution story here. > > my point was: when mixing old files where _hoodie_operation is NOT present with new files where it is

[GitHub] [hudi] leesf commented on a change in pull request #3330: [HUDI-2101][RFC-28]support z-order for hudi

2021-07-23 Thread GitBox
leesf commented on a change in pull request #3330: URL: https://github.com/apache/hudi/pull/3330#discussion_r675936467 ## File path: hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/OptimizeTableByCurve.scala ## @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] leesf commented on a change in pull request #3330: [HUDI-2101][RFC-28]support z-order for hudi

2021-07-23 Thread GitBox
leesf commented on a change in pull request #3330: URL: https://github.com/apache/hudi/pull/3330#discussion_r675936393 ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/spark/sql/hudi/ZOrderingUtil.java ## @@ -0,0 +1,228 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] leesf commented on a change in pull request #3330: [HUDI-2101][RFC-28]support z-order for hudi

2021-07-23 Thread GitBox
leesf commented on a change in pull request #3330: URL: https://github.com/apache/hudi/pull/3330#discussion_r675936338 ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/cluster/SparkExecuteClusteringCommitActionExecutor.java ## @@ -105,6

[GitHub] [hudi] leesf commented on a change in pull request #3330: [HUDI-2101][RFC-28]support z-order for hudi

2021-07-23 Thread GitBox
leesf commented on a change in pull request #3330: URL: https://github.com/apache/hudi/pull/3330#discussion_r675935594 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractHoodieWriteClient.java ## @@ -182,6 +183,13 @@ public boolean

[GitHub] [hudi] leesf commented on a change in pull request #3330: [HUDI-2101][RFC-28]support z-order for hudi

2021-07-23 Thread GitBox
leesf commented on a change in pull request #3330: URL: https://github.com/apache/hudi/pull/3330#discussion_r675935583 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractHoodieWriteClient.java ## @@ -182,6 +183,13 @@ public boolean

[GitHub] [hudi] leesf commented on a change in pull request #3330: [HUDI-2101][RFC-28]support z-order for hudi

2021-07-23 Thread GitBox
leesf commented on a change in pull request #3330: URL: https://github.com/apache/hudi/pull/3330#discussion_r675935750 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/HoodieTable.java ## @@ -238,6 +238,17 @@ private synchronized

[GitHub] [hudi] leesf commented on a change in pull request #3330: [HUDI-2101][RFC-28]support z-order for hudi

2021-07-23 Thread GitBox
leesf commented on a change in pull request #3330: URL: https://github.com/apache/hudi/pull/3330#discussion_r675935583 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractHoodieWriteClient.java ## @@ -182,6 +183,13 @@ public boolean

[GitHub] [hudi] leesf commented on a change in pull request #3330: [HUDI-2101][RFC-28]support z-order for hudi

2021-07-23 Thread GitBox
leesf commented on a change in pull request #3330: URL: https://github.com/apache/hudi/pull/3330#discussion_r675935419 ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/spark/sql/hudi/UnsafeAccess.java ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] leesf commented on a change in pull request #3330: [HUDI-2101][RFC-28]support z-order for hudi

2021-07-23 Thread GitBox
leesf commented on a change in pull request #3330: URL: https://github.com/apache/hudi/pull/3330#discussion_r675934932 ## File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/clustering/SparkZSortAndSizeExecutionStrategy.java ## @@ -0,0 +1,142 @@ +/* + *

[GitHub] [hudi] leesf commented on a change in pull request #3330: [HUDI-2101][RFC-28]support z-order for hudi

2021-07-23 Thread GitBox
leesf commented on a change in pull request #3330: URL: https://github.com/apache/hudi/pull/3330#discussion_r675934932 ## File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/clustering/SparkZSortAndSizeExecutionStrategy.java ## @@ -0,0 +1,142 @@ +/* + *

[GitHub] [hudi] danny0405 merged pull request #3319: [HUDI-2210] Replace deprecated method isDir with isDirectory

2021-07-23 Thread GitBox
danny0405 merged pull request #3319: URL: https://github.com/apache/hudi/pull/3319

[GitHub] [hudi] hudi-bot edited a comment on pull request #3334: [HUDI-2209] Bulk insert for flink writer

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3334: URL: https://github.com/apache/hudi/pull/3334#issuecomment-885497812 ## CI report: * 6f7f38716d9a2c0ef10ebf2c349cdcf0e5f053de UNKNOWN * 6fecd2b043e75f3327d4a8f7348c7675682fcd70 Azure:

[GitHub] [hudi] vinothchandar commented on issue #3337: [SUPPORT] Trouble getting yyyy/mm partitioning to work with Hive sync

2021-07-23 Thread GitBox
vinothchandar commented on issue #3337: URL: https://github.com/apache/hudi/issues/3337#issuecomment-885979308 Have you tried ``` --partition-value-extractor org.apache.hudi.hive.MultiPartKeysValueExtractor --partitioned-by year, month ``` if you look at this code below, it

[GitHub] [hudi] hudi-bot edited a comment on pull request #3302: [HUDI-1241] Automate the generation of configs webpage as configs are added to Hudi repo

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3302: URL: https://github.com/apache/hudi/pull/3302#issuecomment-882806783 ## CI report: * 5b8bba6d9860a787ac7413a1ce0bedf142d6b4c7 UNKNOWN * 27c7c05ca1705faa76fde2e3e888311bb62b8d54 UNKNOWN *

[GitHub] [hudi] vinothchandar opened a new issue #3337: [SUPPORT] Trouble getting yyyy/mm partitioning to work with Hive sync

2021-07-23 Thread GitBox
vinothchandar opened a new issue #3337: URL: https://github.com/apache/hudi/issues/3337 **Describe the problem you faced** Hi, everyone! We ingest data with options: ``` hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.CustomKeyGenerator

[GitHub] [hudi] hudi-bot edited a comment on pull request #3334: [HUDI-2209] Bulk insert for flink writer

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3334: URL: https://github.com/apache/hudi/pull/3334#issuecomment-885497812 ## CI report: * 5bc8824bcb6983e12596d79a4b9df0b9c42a4502 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3334: [HUDI-2209] Bulk insert for flink writer

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3334: URL: https://github.com/apache/hudi/pull/3334#issuecomment-885497812 ## CI report: * 5bc8824bcb6983e12596d79a4b9df0b9c42a4502 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3334: [HUDI-2209] Bulk insert for flink writer

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3334: URL: https://github.com/apache/hudi/pull/3334#issuecomment-885497812 ## CI report: * 5bc8824bcb6983e12596d79a4b9df0b9c42a4502 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3334: [HUDI-2209] Bulk insert for flink writer

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3334: URL: https://github.com/apache/hudi/pull/3334#issuecomment-885497812 ## CI report: * bec06d2304c67b544befff79bc6559520024f7b3 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3302: [HUDI-1241] Automate the generation of configs webpage as configs are added to Hudi repo

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3302: URL: https://github.com/apache/hudi/pull/3302#issuecomment-882806783 ## CI report: * 5b8bba6d9860a787ac7413a1ce0bedf142d6b4c7 UNKNOWN * 27c7c05ca1705faa76fde2e3e888311bb62b8d54 UNKNOWN *

[GitHub] [hudi] hudi-bot edited a comment on pull request #3334: [HUDI-2209] Bulk insert for flink writer

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3334: URL: https://github.com/apache/hudi/pull/3334#issuecomment-885497812 ## CI report: * bec06d2304c67b544befff79bc6559520024f7b3 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3302: [HUDI-1241] Automate the generation of configs webpage as configs are added to Hudi repo

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3302: URL: https://github.com/apache/hudi/pull/3302#issuecomment-882806783 ## CI report: * 5b8bba6d9860a787ac7413a1ce0bedf142d6b4c7 UNKNOWN * 27c7c05ca1705faa76fde2e3e888311bb62b8d54 UNKNOWN *

[GitHub] [hudi] hudi-bot edited a comment on pull request #3334: [HUDI-2209] Bulk insert for flink writer

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3334: URL: https://github.com/apache/hudi/pull/3334#issuecomment-885497812 ## CI report: * bec06d2304c67b544befff79bc6559520024f7b3 Azure:

[GitHub] [hudi] danny0405 commented on a change in pull request #3334: [HUDI-2209] Bulk insert for flink writer

2021-07-23 Thread GitBox
danny0405 commented on a change in pull request #3334: URL: https://github.com/apache/hudi/pull/3334#discussion_r675907954 ## File path: hudi-flink/src/main/java/org/apache/hudi/sink/bulk/BulkInsertWriterHelper.java ## @@ -0,0 +1,192 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] mithalee opened a new issue #3336: [SUPPORT]

2021-07-23 Thread GitBox
mithalee opened a new issue #3336: URL: https://github.com/apache/hudi/issues/3336 **_Tips before filing an issue_** - Have you gone through our [FAQs](https://cwiki.apache.org/confluence/display/HUDI/FAQ)? - Join the mailing list to engage in conversations and get faster

[GitHub] [hudi] vinothchandar commented on pull request #3285: [HUDI-1771] Propagate CDC format for hoodie

2021-07-23 Thread GitBox
vinothchandar commented on pull request #3285: URL: https://github.com/apache/hudi/pull/3285#issuecomment-885764276 @codope please help me close out the schema evolution story here. my point was: when mixing old files where _hoodie_operation is NOT present with new files where it

[GitHub] [hudi] yuzhaojing commented on a change in pull request #3334: [HUDI-2209] Bulk insert for flink writer

2021-07-23 Thread GitBox
yuzhaojing commented on a change in pull request #3334: URL: https://github.com/apache/hudi/pull/3334#discussion_r675673937 ## File path: hudi-flink/src/main/java/org/apache/hudi/sink/bulk/BulkInsertWriterHelper.java ## @@ -0,0 +1,192 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] FelixKJose edited a comment on issue #3323: [SUPPORT] Trouble with Point in time, Incremental queries

2021-07-23 Thread GitBox
FelixKJose edited a comment on issue #3323: URL: https://github.com/apache/hudi/issues/3323#issuecomment-885762685 @nsivabalan is this only a problem in 0.8.0 version or older version as well? Could you please provide some insights?

[GitHub] [hudi] FelixKJose commented on issue #3323: [SUPPORT] Trouble with Point in time, Incremental queries

2021-07-23 Thread GitBox
FelixKJose commented on issue #3323: URL: https://github.com/apache/hudi/issues/3323#issuecomment-885762685 @nsivabalan is this only a problem in 0.8.0 version or older version as well?

[GitHub] [hudi] hudi-bot edited a comment on pull request #3233: [HUDI-1138] Add timeline-server-based marker file strategy for improving marker-related latency

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3233: URL: https://github.com/apache/hudi/pull/3233#issuecomment-875280958 ## CI report: * 2d22335c215ed620ce20018b1c83be189b7c70c6 UNKNOWN * 230205edfab190cfaf687d0323ae8d704f425e1d UNKNOWN *

[GitHub] [hudi] satishkotha merged pull request #2879: [HUDI-1848] Adding support for HMS for running DDL queries in hive-sy…

2021-07-23 Thread GitBox
satishkotha merged pull request #2879: URL: https://github.com/apache/hudi/pull/2879

[GitHub] [hudi] hudi-bot edited a comment on pull request #3233: [HUDI-1138] Add timeline-server-based marker file strategy for improving marker-related latency

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3233: URL: https://github.com/apache/hudi/pull/3233#issuecomment-875280958 ## CI report: * 2d22335c215ed620ce20018b1c83be189b7c70c6 UNKNOWN * f1095198d43636de20b525d80341c32f84591d48 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3233: [HUDI-1138] Add timeline-server-based marker file strategy for improving marker-related latency

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3233: URL: https://github.com/apache/hudi/pull/3233#issuecomment-875280958 ## CI report: * 2d22335c215ed620ce20018b1c83be189b7c70c6 UNKNOWN * f1095198d43636de20b525d80341c32f84591d48 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3334: [HUDI-2209] Bulk insert for flink writer

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3334: URL: https://github.com/apache/hudi/pull/3334#issuecomment-885497812 ## CI report: * bec06d2304c67b544befff79bc6559520024f7b3 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3334: [HUDI-2209] Bulk insert for flink writer

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3334: URL: https://github.com/apache/hudi/pull/3334#issuecomment-885497812 ## CI report: * d23d24f8cee6fd55de1e433f1c07b40a2e0a3391 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3334: [HUDI-2209] Bulk insert for flink writer

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3334: URL: https://github.com/apache/hudi/pull/3334#issuecomment-885497812 ## CI report: * d23d24f8cee6fd55de1e433f1c07b40a2e0a3391 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3259: [HUDI-2164] Let users build cluster plan and execute this plan at once using HoodieClusteringJob for async clustering

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249 ## CI report: * ec01ad1f162813a5fafb7d14da7b65eea64d06ea Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3328: [HUDI-2208] Support Bulk Insert For Spark Sql

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3328: URL: https://github.com/apache/hudi/pull/3328#issuecomment-884869427 ## CI report: * 9c9f804618dd0275abdae10673c21bf1f5737caf UNKNOWN * 50539ec543951e7a4442798ac7c66e5dc3d3705a UNKNOWN *

[GitHub] [hudi] yanghua merged pull request #3333: [HUDI-2213] Remove unnecessary parameter for HoodieMetrics constructo…

2021-07-23 Thread GitBox
yanghua merged pull request #: URL: https://github.com/apache/hudi/pull/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3259: [HUDI-2164] Let users build cluster plan and execute this plan at once using HoodieClusteringJob for async clustering

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249 ## CI report: * ec01ad1f162813a5fafb7d14da7b65eea64d06ea Azure:

[GitHub] [hudi] yuzhaojing commented on pull request #3334: [HUDI-2209] Bulk insert for flink writer

2021-07-23 Thread GitBox
yuzhaojing commented on pull request #3334: URL: https://github.com/apache/hudi/pull/3334#issuecomment-885575380 I'm excited to see this feature! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [hudi] zhangyue19921010 commented on pull request #3259: [HUDI-2164] Let users build cluster plan and execute this plan at once using HoodieClusteringJob for async clustering

2021-07-23 Thread GitBox
zhangyue19921010 commented on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-885574686 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [hudi] hudi-bot edited a comment on pull request #3277: [HUDI-2182] Support Compaction Command For Spark Sql

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3277: URL: https://github.com/apache/hudi/pull/3277#issuecomment-880506300 ## CI report: * 0eeef02b58baa7ade8fc0196c2c16c165daafcdf Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3328: [HUDI-2208] Support Bulk Insert For Spark Sql

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3328: URL: https://github.com/apache/hudi/pull/3328#issuecomment-884869427 ## CI report: * 9c9f804618dd0275abdae10673c21bf1f5737caf UNKNOWN * 50539ec543951e7a4442798ac7c66e5dc3d3705a UNKNOWN *

[GitHub] [hudi] hudi-bot edited a comment on pull request #3335: [HUDI-2214] fix the bug that residual temporary files after clustering are not cleaned up

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3335: URL: https://github.com/apache/hudi/pull/3335#issuecomment-885523651 ## CI report: * 9bebeaf2c723810d6c6d5df00e4d6f36b4f478e4 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3328: [HUDI-2208] Support Bulk Insert For Spark Sql

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3328: URL: https://github.com/apache/hudi/pull/3328#issuecomment-884869427 ## CI report: * 9c9f804618dd0275abdae10673c21bf1f5737caf UNKNOWN * 50539ec543951e7a4442798ac7c66e5dc3d3705a UNKNOWN *

[GitHub] [hudi] hudi-bot edited a comment on pull request #3189: [HUDI-2098] Support hdfs file lock

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3189: URL: https://github.com/apache/hudi/pull/3189#issuecomment-871217447 ## CI report: * 51ad1f1e7316e61db862dc1d2934d4d5fc54848b Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3334: [HUDI-2209] Bulk insert for flink writer

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3334: URL: https://github.com/apache/hudi/pull/3334#issuecomment-885497812 ## CI report: * d23d24f8cee6fd55de1e433f1c07b40a2e0a3391 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3285: [HUDI-1771] Propagate CDC format for hoodie

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3285: URL: https://github.com/apache/hudi/pull/3285#issuecomment-881141261 ## CI report: * 4660e96db4081115eaa7877b8584466347f78fea UNKNOWN * a14726c462aeb682391bf762f4a9d8e03a0da25b Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3277: [HUDI-2182] Support Compaction Command For Spark Sql

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3277: URL: https://github.com/apache/hudi/pull/3277#issuecomment-880506300 ## CI report: * ddbbd6098b0c74a85231f14a5b6909b671f406b0 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3277: [HUDI-2182] Support Compaction Command For Spark Sql

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3277: URL: https://github.com/apache/hudi/pull/3277#issuecomment-880506300 ## CI report: * ddbbd6098b0c74a85231f14a5b6909b671f406b0 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3335: [HUDI-2214] fix the bug that residual temporary files after clustering are not cleaned up

2021-07-23 Thread GitBox
hudi-bot edited a comment on pull request #3335: URL: https://github.com/apache/hudi/pull/3335#issuecomment-885523651 ## CI report: * 9bebeaf2c723810d6c6d5df00e4d6f36b4f478e4 Azure:
