[GitHub] [incubator-hudi] dengziming commented on pull request #1151: [HUDI-476] Add hudi-examples module

2020-05-14 Thread GitBox
dengziming commented on pull request #1151: URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-629044776 > Can you confirm if you have run these examples locally once and verified the instructions work? @vinothchandar , I ran these examples locally and ensured they

[incubator-hudi] branch master updated: HUDI-528 Handle empty commit in incremental pulling (#1612)

2020-05-14 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git The following commit(s) were added to refs/heads/master by this push: new a64afdf HUDI-528 Handle empty commit

[GitHub] [incubator-hudi] bvaradar merged pull request #1612: [HUDI-528] Handle empty commit in incremental pulling

2020-05-14 Thread GitBox
bvaradar merged pull request #1612: URL: https://github.com/apache/incubator-hudi/pull/1612 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [incubator-hudi] bvaradar closed pull request #1532: [HUDI-794]: implemented optional use of --config-folder option in HoodieDeltaStreamer

2020-05-14 Thread GitBox
bvaradar closed pull request #1532: URL: https://github.com/apache/incubator-hudi/pull/1532 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1532: [HUDI-794]: implemented optional use of --config-folder option in HoodieDeltaStreamer

2020-05-14 Thread GitBox
bvaradar commented on a change in pull request #1532: URL: https://github.com/apache/incubator-hudi/pull/1532#discussion_r425579387 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/DeltaStreamerUtility.java ## @@ -0,0 +1,128 @@ +/* + * Licensed to the

[GitHub] [incubator-hudi] bvaradar commented on pull request #1520: [HUDI-797] Small performance improvement for rewriting records.

2020-05-14 Thread GitBox
bvaradar commented on pull request #1520: URL: https://github.com/apache/incubator-hudi/pull/1520#issuecomment-629040082 @prashantwason : Looking at the comments, it looks like this PR is going to be abandoned. If so, can you please close this PR.

[GitHub] [incubator-hudi] yanghua commented on pull request #1611: [HUDI-705]Add unit test for RollbacksCommand

2020-05-14 Thread GitBox
yanghua commented on pull request #1611: URL: https://github.com/apache/incubator-hudi/pull/1611#issuecomment-629039996 > @yanghua you can review this as well if possible :) OK This is an automated message from the

[GitHub] [incubator-hudi] dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module

2020-05-14 Thread GitBox
dengziming commented on a change in pull request #1151: URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r425577242 ## File path: hudi-examples/pom.xml ## @@ -0,0 +1,198 @@ + + +http://maven.apache.org/POM/4.0.0;

[GitHub] [incubator-hudi] bvaradar closed issue #1586: [SUPPORT] DMS with 2 key example

2020-05-14 Thread GitBox
bvaradar closed issue #1586: URL: https://github.com/apache/incubator-hudi/issues/1586 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [incubator-hudi] bvaradar commented on issue #1630: [SUPPORT] Latest commit does not have any schema in commit metadata

2020-05-14 Thread GitBox
bvaradar commented on issue #1630: URL: https://github.com/apache/incubator-hudi/issues/1630#issuecomment-629037540 Cherry-picking selective diffs is always a tricky business. Maybe you can use master or use 0.5.2 and apply the patch and try. Also, 0.5.3 release is going to happen shortly

[GitHub] [incubator-hudi] codecov-io edited a comment on pull request #1151: [HUDI-476] Add hudi-examples module

2020-05-14 Thread GitBox
codecov-io edited a comment on pull request #1151: URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-593277561 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr=h1) Report > Merging

[GitHub] [incubator-hudi] codecov-io edited a comment on pull request #1602: [HUDI-494] fix incorrect record size estimation

2020-05-14 Thread GitBox
codecov-io edited a comment on pull request #1602: URL: https://github.com/apache/incubator-hudi/pull/1602#issuecomment-628183877 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1602?src=pr=h1) Report > Merging

[GitHub] [incubator-hudi] zherenyu831 commented on issue #1631: [SUPPORT] After changed schema could not update schema of Hive

2020-05-14 Thread GitBox
zherenyu831 commented on issue #1631: URL: https://github.com/apache/incubator-hudi/issues/1631#issuecomment-629015268 @bvaradar Thank you, I think this is the only solution that could work now. This is an automated message

[GitHub] [incubator-hudi] zherenyu831 closed issue #1631: [SUPPORT] After changed schema could not update schema of Hive

2020-05-14 Thread GitBox
zherenyu831 closed issue #1631: URL: https://github.com/apache/incubator-hudi/issues/1631 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [incubator-hudi] codecov-io edited a comment on pull request #1602: [HUDI-494] fix incorrect record size estimation

2020-05-14 Thread GitBox
codecov-io edited a comment on pull request #1602: URL: https://github.com/apache/incubator-hudi/pull/1602#issuecomment-628183877 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1602?src=pr=h1) Report > Merging

[GitHub] [incubator-hudi] garyli1019 commented on pull request #1592: [Hudi-69] Spark Datasource for MOR table

2020-05-14 Thread GitBox
garyli1019 commented on pull request #1592: URL: https://github.com/apache/incubator-hudi/pull/1592#issuecomment-629011466 Thanks @xushiyan! I will give it a try. This is an automated message from the Apache Git Service. To

[GitHub] [incubator-hudi] nsivabalan commented on a change in pull request #1433: [HUDI-728]: Implement custom key generator

2020-05-14 Thread GitBox
nsivabalan commented on a change in pull request #1433: URL: https://github.com/apache/incubator-hudi/pull/1433#discussion_r425547719 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/SimpleKeyGenerator.java ## @@ -66,6 +68,14 @@ public HoodieKey

Build failed in Jenkins: hudi-snapshot-deployment-0.5 #278

2020-05-14 Thread Apache Jenkins Server
See Changes: -- [...truncated 2.38 KB...] /home/jenkins/tools/maven/apache-maven-3.5.4/conf: logging settings.xml toolchains.xml

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1402: [HUDI-407] Adding Simple Index

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1402: URL: https://github.com/apache/incubator-hudi/pull/1402#issuecomment-629007570 Yes. Please add a good commit message, given this is an important feature. This is an automated

[GitHub] [incubator-hudi] garyli1019 commented on pull request #1602: [HUDI-494] fix incorrect record size estimation

2020-05-14 Thread GitBox
garyli1019 commented on pull request #1602: URL: https://github.com/apache/incubator-hudi/pull/1602#issuecomment-629005508 switched to `totalBytesWritten > hoodieWriteConfig.getParquetSmallFileLimit()`. I think this way would have minimal impact and handle this bug.

[GitHub] [incubator-hudi] nsivabalan commented on a change in pull request #1409: [HUDI-714]Add javadoc and comments to hudi write method link

2020-05-14 Thread GitBox
nsivabalan commented on a change in pull request #1409: URL: https://github.com/apache/incubator-hudi/pull/1409#discussion_r425540622 ## File path: hudi-spark/src/main/java/org/apache/hudi/DataSourceUtils.java ## @@ -241,6 +241,13 @@ public static HoodieRecord

[GitHub] [incubator-hudi] nsivabalan commented on pull request #1402: [HUDI-407] Adding Simple Index

2020-05-14 Thread GitBox
nsivabalan commented on pull request #1402: URL: https://github.com/apache/incubator-hudi/pull/1402#issuecomment-628997531 @vinothchandar : tests are passing. Let me know if you want me to squash all commits This is an

[GitHub] [incubator-hudi] yanghua commented on pull request #1558: [HUDI-796]: added deduping logic for upserts case

2020-05-14 Thread GitBox
yanghua commented on pull request #1558: URL: https://github.com/apache/incubator-hudi/pull/1558#issuecomment-628985363 > I will let @yanghua see this home OK, and @pratyakshsharma first of all, please fix all the conflicting files.

[GitHub] [incubator-hudi] dengziming commented on pull request #1151: [HUDI-476] Add hudi-examples module

2020-05-14 Thread GitBox
dengziming commented on pull request #1151: URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-628968945 @vinothchandar sorry, a little busy these days, I will address your comments in a few days. This

[GitHub] [incubator-hudi] leesf commented on pull request #1622: [HUDI-888] fix NullPointerException

2020-05-14 Thread GitBox
leesf commented on pull request #1622: URL: https://github.com/apache/incubator-hudi/pull/1622#issuecomment-628966948 @rolandjohann Would you please check why the travis is red? This is an automated message from the Apache

[GitHub] [incubator-hudi] codecov-io commented on pull request #1616: [HUDI-786] Fixing read beyond inline length in InlineFS

2020-05-14 Thread GitBox
codecov-io commented on pull request #1616: URL: https://github.com/apache/incubator-hudi/pull/1616#issuecomment-628960229 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1616?src=pr=h1) Report > Merging

[GitHub] [incubator-hudi] v3nkatesh commented on pull request #1484: [HUDI-316] : Hbase qps repartition writestatus

2020-05-14 Thread GitBox
v3nkatesh commented on pull request #1484: URL: https://github.com/apache/incubator-hudi/pull/1484#issuecomment-628953098 > @v3nkatesh There are a couple of pending comments from @satishkotha. If you can finish up with those we can merge this PR with the condition that we need to add to

[GitHub] [incubator-hudi] v3nkatesh commented on a change in pull request #1484: [HUDI-316] : Hbase qps repartition writestatus

2020-05-14 Thread GitBox
v3nkatesh commented on a change in pull request #1484: URL: https://github.com/apache/incubator-hudi/pull/1484#discussion_r425498242 ## File path: hudi-client/src/main/java/org/apache/hudi/index/hbase/HBaseIndex.java ## @@ -83,13 +88,14 @@ private static final byte[]

[GitHub] [incubator-hudi] v3nkatesh commented on a change in pull request #1484: [HUDI-316] : Hbase qps repartition writestatus

2020-05-14 Thread GitBox
v3nkatesh commented on a change in pull request #1484: URL: https://github.com/apache/incubator-hudi/pull/1484#discussion_r425496973 ## File path: hudi-client/src/main/java/org/apache/hudi/index/hbase/HBaseIndex.java ## @@ -322,66 +347,94 @@ private boolean

[GitHub] [incubator-hudi] nsivabalan commented on a change in pull request #1616: [HUDI-786] Fixing read beyond inline length in InlineFS

2020-05-14 Thread GitBox
nsivabalan commented on a change in pull request #1616: URL: https://github.com/apache/incubator-hudi/pull/1616#discussion_r425494577 ## File path: hudi-common/src/main/java/org/apache/hudi/common/fs/inline/InLineFsDataInputStream.java ## @@ -56,24 +56,29 @@ public long

[GitHub] [incubator-hudi] nsivabalan commented on a change in pull request #1616: [HUDI-786] Fixing read beyond inline length in InlineFS

2020-05-14 Thread GitBox
nsivabalan commented on a change in pull request #1616: URL: https://github.com/apache/incubator-hudi/pull/1616#discussion_r425494577 ## File path: hudi-common/src/main/java/org/apache/hudi/common/fs/inline/InLineFsDataInputStream.java ## @@ -56,24 +56,29 @@ public long

[GitHub] [incubator-hudi] nsivabalan commented on a change in pull request #1616: [HUDI-786] Fixing read beyond inline length in InlineFS

2020-05-14 Thread GitBox
nsivabalan commented on a change in pull request #1616: URL: https://github.com/apache/incubator-hudi/pull/1616#discussion_r425494577 ## File path: hudi-common/src/main/java/org/apache/hudi/common/fs/inline/InLineFsDataInputStream.java ## @@ -56,24 +56,29 @@ public long

[GitHub] [incubator-hudi] n3nash commented on a change in pull request #1484: [HUDI-316] : Hbase qps repartition writestatus

2020-05-14 Thread GitBox
n3nash commented on a change in pull request #1484: URL: https://github.com/apache/incubator-hudi/pull/1484#discussion_r425489735 ## File path: hudi-client/src/main/java/org/apache/hudi/index/hbase/HBaseIndex.java ## @@ -83,13 +88,14 @@ private static final byte[]

[GitHub] [incubator-hudi] garyli1019 commented on pull request #1602: [HUDI-494] fix incorrect record size estimation

2020-05-14 Thread GitBox
garyli1019 commented on pull request #1602: URL: https://github.com/apache/incubator-hudi/pull/1602#issuecomment-628934924 > commit1 only wrote 1 record but the parquet file is 20MB @vinothchandar Sorry this example is bad... Let's say 8MB(2M entries) bloom filter + 200 records

[GitHub] [incubator-hudi] EdwinGuo commented on issue #1630: [SUPPORT] Latest commit does not have any schema in commit metadata

2020-05-14 Thread GitBox
EdwinGuo commented on issue #1630: URL: https://github.com/apache/incubator-hudi/issues/1630#issuecomment-628934279 Thanks @bvaradar. Do you think it makes sense for post-0.5.1 versions to support deletes on data written by 0.5.0 or older? Where is the schema stored for 0.5.0? Is it only refer

[GitHub] [incubator-hudi] bvaradar commented on issue #1630: [SUPPORT] Latest commit does not have any schema in commit metadata

2020-05-14 Thread GitBox
bvaradar commented on issue #1630: URL: https://github.com/apache/incubator-hudi/issues/1630#issuecomment-628918822 No, I meant the feature of writing schema to commit file. It was added in 0.5.1. Pre-0.5.1 commit files won't have schema in commit metadata.

[GitHub] [incubator-hudi] xushiyan edited a comment on pull request #1514: [WIP] [HUDI-774] Addressing incorrect Spark to Avro schema generation

2020-05-14 Thread GitBox
xushiyan edited a comment on pull request #1514: URL: https://github.com/apache/incubator-hudi/pull/1514#issuecomment-628915411 @afilipchik The master codebase has been migrated to JUnit 5. Please kindly rebase and update the usage to JUnit 5 APIs where applicable.

[GitHub] [incubator-hudi] xushiyan commented on pull request #1514: [WIP] [HUDI-774] Addressing incorrect Spark to Avro schema generation

2020-05-14 Thread GitBox
xushiyan commented on pull request #1514: URL: https://github.com/apache/incubator-hudi/pull/1514#issuecomment-628915411 @afilipchik The master codebase has been migrated to JUnit 5. Please kindly upgrade the usage to JUnit 5 APIs.

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1602: [HUDI-494] fix incorrect record size estimation

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1602: URL: https://github.com/apache/incubator-hudi/pull/1602#issuecomment-628910318 @garyli1019 thinking about it, even today without the bloom filters, the parquet size includes additional stats and metadata contained internally.. So, it's never

[GitHub] [incubator-hudi] vinothchandar edited a comment on pull request #1602: [HUDI-494] fix incorrect record size estimation

2020-05-14 Thread GitBox
vinothchandar edited a comment on pull request #1602: URL: https://github.com/apache/incubator-hudi/pull/1602#issuecomment-628910318 @garyli1019 thinking about it, even today without the bloom filters, the parquet size includes additional stats and metadata contained internally.. So, it's

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1509: [HUDI-525] lack of insert info in delta_commit inflight

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1509: URL: https://github.com/apache/incubator-hudi/pull/1509#issuecomment-628902977 @n3nash is this ready to land? This is an automated message from the Apache Git Service. To

[GitHub] [incubator-hudi] vinothchandar closed pull request #1387: [WIP] [HUDI-674] Rename hudi-hadoop-mr-bundle to hudi-hive-bundle

2020-05-14 Thread GitBox
vinothchandar closed pull request #1387: URL: https://github.com/apache/incubator-hudi/pull/1387 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1562: [HUDI-837]: implemented custom deserializer for AvroKafkaSource

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1562: URL: https://github.com/apache/incubator-hudi/pull/1562#issuecomment-628899382 @n3nash can you review this and take it home? This is an automated message from the Apache Git

[GitHub] [incubator-hudi] vinothchandar closed pull request #1253: [WIP] [HUDI-558] Introduce ability to compress bloom filters while storing in parquet

2020-05-14 Thread GitBox
vinothchandar closed pull request #1253: URL: https://github.com/apache/incubator-hudi/pull/1253 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1597: [WIP] Added a MultiFormatTimestampBasedKeyGenerator that allows for multipl…

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1597: URL: https://github.com/apache/incubator-hudi/pull/1597#issuecomment-628898626 @pratyakshsharma in the meantime, if you want to absorb this into #1433 , we can do that as well. Assuming @allenerb does not mind..

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1597: Added a MultiFormatTimestampBasedKeyGenerator that allows for multipl…

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1597: URL: https://github.com/apache/incubator-hudi/pull/1597#issuecomment-628898234 Actually, thinking again.. we can take the time we want on this and get this into 0.6.0.. will make this as WIP and come back to it after 0.5.3 is pushed out

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1597: Added a MultiFormatTimestampBasedKeyGenerator that allows for multipl…

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1597: URL: https://github.com/apache/incubator-hudi/pull/1597#issuecomment-628896478 @allenerb wondering if you have a JIRA for this already.. This is an automated message from the

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1565: [HUDI-73]: implemented vanilla AvroKafkaSource

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1565: URL: https://github.com/apache/incubator-hudi/pull/1565#issuecomment-628889429 Overall, this PR is nice in the sense that it lets us read data from Kafka using Avro, with a fixed schema.. but then, it cannot handle evolutions that well

[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1565: [HUDI-73]: implemented vanilla AvroKafkaSource

2020-05-14 Thread GitBox
vinothchandar commented on a change in pull request #1565: URL: https://github.com/apache/incubator-hudi/pull/1565#discussion_r425432347 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/serde/AbstractHoodieKafkaAvroDeserializer.java ## @@ -0,0 +1,97 @@ +/*

[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1562: [HUDI-837]: implemented custom deserializer for AvroKafkaSource

2020-05-14 Thread GitBox
vinothchandar commented on a change in pull request #1562: URL: https://github.com/apache/incubator-hudi/pull/1562#discussion_r425429344 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/serde/HoodieAvroKafkaDeserializer.java ## @@ -0,0 +1,78 @@ +/*

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1562: [HUDI-837]: implemented custom deserializer for AvroKafkaSource

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1562: URL: https://github.com/apache/incubator-hudi/pull/1562#issuecomment-628885456 > Not sure of how to mock the same here since it is library class. We can just mock the response it will send into a test SchemaProvider.. We need not mock

[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1562: [HUDI-837]: implemented custom deserializer for AvroKafkaSource

2020-05-14 Thread GitBox
vinothchandar commented on a change in pull request #1562: URL: https://github.com/apache/incubator-hudi/pull/1562#discussion_r425421081 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/AvroKafkaSource.java ## @@ -45,11 +46,14 @@ private final

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1433: [HUDI-728]: Implement custom key generator

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1433: URL: https://github.com/apache/incubator-hudi/pull/1433#issuecomment-628873883 @nsivabalan can you shepherd this one home from here? This is an automated message from the

[GitHub] [incubator-hudi] vinothchandar merged pull request #1541: [HUDI-843] Add ability to specify time unit for TimestampBasedKeyGenerator

2020-05-14 Thread GitBox
vinothchandar merged pull request #1541: URL: https://github.com/apache/incubator-hudi/pull/1541 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[incubator-hudi] branch master updated: [HUDI-843] Add ability to specify time unit for TimestampBasedKeyGenerator (#1541)

2020-05-14 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git The following commit(s) were added to refs/heads/master by this push: new f094f42 [HUDI-843] Add ability to

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1433: [HUDI-728]: Implement custom key generator

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1433: URL: https://github.com/apache/incubator-hudi/pull/1433#issuecomment-628873197 @pratyakshsharma Rebased and removed the parquet files etc.. This is an automated message from

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1597: Added a MultiFormatTimestampBasedKeyGenerator that allows for multipl…

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1597: URL: https://github.com/apache/incubator-hudi/pull/1597#issuecomment-628868262 Hi @allenerb sg.. Will do some minor changes and try to get this landed. Will file a follow up JIRA.. which you or @pratyakshsharma or someone can take up ..

[jira] [Created] (HUDI-900) Metadata Bootstrap Key Generator needs to handle complex keys correctly

2020-05-14 Thread Balaji Varadarajan (Jira)
Balaji Varadarajan created HUDI-900: --- Summary: Metadata Bootstrap Key Generator needs to handle complex keys correctly Key: HUDI-900 URL: https://issues.apache.org/jira/browse/HUDI-900 Project:

[jira] [Assigned] (HUDI-900) Metadata Bootstrap Key Generator needs to handle complex keys correctly

2020-05-14 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan reassigned HUDI-900: --- Assignee: Balaji Varadarajan > Metadata Bootstrap Key Generator needs to handle

[jira] [Updated] (HUDI-900) Metadata Bootstrap Key Generator needs to handle complex keys correctly

2020-05-14 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-900: Status: Open (was: New) > Metadata Bootstrap Key Generator needs to handle complex keys

[GitHub] [incubator-hudi] codecov-io edited a comment on pull request #1612: [HUDI-528] Handle empty commit in incremental pulling

2020-05-14 Thread GitBox
codecov-io edited a comment on pull request #1612: URL: https://github.com/apache/incubator-hudi/pull/1612#issuecomment-626417448 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1612?src=pr=h1) Report > Merging

[GitHub] [incubator-hudi] codecov-io edited a comment on pull request #1612: [HUDI-528] Handle empty commit in incremental pulling

2020-05-14 Thread GitBox
codecov-io edited a comment on pull request #1612: URL: https://github.com/apache/incubator-hudi/pull/1612#issuecomment-626417448 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1612?src=pr=h1) Report > Merging

[GitHub] [incubator-hudi] codecov-io edited a comment on pull request #1612: [HUDI-528] Handle empty commit in incremental pulling

2020-05-14 Thread GitBox
codecov-io edited a comment on pull request #1612: URL: https://github.com/apache/incubator-hudi/pull/1612#issuecomment-626417448 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1612?src=pr=h1) Report > Merging

[jira] [Created] (HUDI-899) Add a knob to change partition-path style while performing metadata bootstrap

2020-05-14 Thread Balaji Varadarajan (Jira)
Balaji Varadarajan created HUDI-899: --- Summary: Add a knob to change partition-path style while performing metadata bootstrap Key: HUDI-899 URL: https://issues.apache.org/jira/browse/HUDI-899

[GitHub] [incubator-hudi] allenerb commented on pull request #1597: Added a MultiFormatTimestampBasedKeyGenerator that allows for multipl…

2020-05-14 Thread GitBox
allenerb commented on pull request #1597: URL: https://github.com/apache/incubator-hudi/pull/1597#issuecomment-628853876 Hi Vinoth, Apologies for not getting back to this PR to make changes. I’ve been swamped trying to get Hudi working in the environment (still struggling with

[GitHub] [incubator-hudi] xushiyan commented on pull request #1592: [Hudi-69] Spark Datasource for MOR table

2020-05-14 Thread GitBox
xushiyan commented on pull request #1592: URL: https://github.com/apache/incubator-hudi/pull/1592#issuecomment-628845961 @garyli1019 it does look very weird... getting NPE at org.apache.hudi.functional.TestDataSource.testSparkDatasourceForMergeOnRead(TestDataSource.scala:227) means the

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1597: Added a MultiFormatTimestampBasedKeyGenerator that allows for multipl…

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1597: URL: https://github.com/apache/incubator-hudi/pull/1597#issuecomment-628843862 @allenerb Trying to understand the next steps here.. Are we deciding between doing a separate class and merging this into the existing class? I am happy to

[jira] [Created] (HUDI-898) Need to add Schema parameter to HoodieRecordPayload::preCombine

2020-05-14 Thread Yixue (Andrew) Zhu (Jira)
Yixue (Andrew) Zhu created HUDI-898: --- Summary: Need to add Schema parameter to HoodieRecordPayload::preCombine Key: HUDI-898 URL: https://issues.apache.org/jira/browse/HUDI-898 Project: Apache Hudi

[GitHub] [incubator-hudi] codecov-io edited a comment on pull request #1602: [HUDI-494] fix incorrect record size estimation

2020-05-14 Thread GitBox
codecov-io edited a comment on pull request #1602: URL: https://github.com/apache/incubator-hudi/pull/1602#issuecomment-628183877 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1602?src=pr=h1) Report > Merging

[GitHub] [incubator-hudi] garyli1019 commented on pull request #1592: [Hudi-69] Spark Datasource for MOR table

2020-05-14 Thread GitBox
garyli1019 commented on pull request #1592: URL: https://github.com/apache/incubator-hudi/pull/1592#issuecomment-628828782 Hello @xushiyan , may I ask a question regarding the functional testing? I believe you are the expert on this topic in our community :) This PR passed CI in my

[GitHub] [incubator-hudi] codecov-io edited a comment on pull request #1602: [HUDI-494] fix incorrect record size estimation

2020-05-14 Thread GitBox
codecov-io edited a comment on pull request #1602: URL: https://github.com/apache/incubator-hudi/pull/1602#issuecomment-628183877 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1602?src=pr=h1) Report > Merging

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1584: fix schema provider issue

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1584: URL: https://github.com/apache/incubator-hudi/pull/1584#issuecomment-628828057 To reduce context switching, assigning to @bvaradar, who is looking into a couple of other PRs around schema provider

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1402: [HUDI-407] Adding Simple Index

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1402: URL: https://github.com/apache/incubator-hudi/pull/1402#issuecomment-628827254 @nsivabalan please ping me when the tests are passing.. Will make a final pass and land This is

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1596: [HUDI-863] get decimal properties from derived spark DataType

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1596: URL: https://github.com/apache/incubator-hudi/pull/1596#issuecomment-628821960 Added you as a contributor on jira.. So you should be able to claim those now! let us know if you still face issues..

[jira] [Commented] (HUDI-864) parquet schema conflict: optional binary (UTF8) is not a group

2020-05-14 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17107561#comment-17107561 ] Vinoth Chandar commented on HUDI-864: - oops.. slipped past my radar.. I prefer not to get into

[jira] [Updated] (HUDI-864) parquet schema conflict: optional binary (UTF8) is not a group

2020-05-14 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-864: Labels: (was: bug-bash-0.6.0) > parquet schema conflict: optional binary (UTF8) is not a group >

[jira] [Updated] (HUDI-864) parquet schema conflict: optional binary (UTF8) is not a group

2020-05-14 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-864: Affects Version/s: 0.5.2 > parquet schema conflict: optional binary (UTF8) is not a group >

[GitHub] [incubator-hudi] rolandjohann commented on pull request #1596: [HUDI-863] get decimal properties from derived spark DataType

2020-05-14 Thread GitBox
rolandjohann commented on pull request #1596: URL: https://github.com/apache/incubator-hudi/pull/1596#issuecomment-628817206 @nsivabalan it seems that I can't do that. The "assign to me" link is missing and clicking on Assignee "Unassigned" doesn't transform the field to the desired people

[GitHub] [incubator-hudi] garyli1019 commented on pull request #1602: [HUDI-494] fix incorrect record size estimation

2020-05-14 Thread GitBox
garyli1019 commented on pull request #1602: URL: https://github.com/apache/incubator-hudi/pull/1602#issuecomment-628817124 @vinothchandar I definitely agree a statistical table would be a better approach, but it will take a while I believe. I am happy to contribute to this topic as well.

[jira] [Updated] (HUDI-863) nested structs containing decimal types lead to null pointer exception

2020-05-14 Thread Roland Johann (Jira)
[ https://issues.apache.org/jira/browse/HUDI-863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roland Johann updated HUDI-863: --- Status: Patch Available (was: In Progress) > nested structs containing decimal types lead to null

[GitHub] [incubator-hudi] bvaradar commented on issue #1631: [SUPPORT] After changed schema could not update schema of Hive

2020-05-14 Thread GitBox
bvaradar commented on issue #1631: URL: https://github.com/apache/incubator-hudi/issues/1631#issuecomment-628801677 @zherenyu831 : Schema evolution rules do not allow adding columns in the middle of the schema. You would need to add them at the end.

[GitHub] [incubator-hudi] EdwinGuo edited a comment on issue #1630: [SUPPORT] Latest commit does not have any schema in commit metadata

2020-05-14 Thread GitBox
EdwinGuo edited a comment on issue #1630: URL: https://github.com/apache/incubator-hudi/issues/1630#issuecomment-628778573 @bvaradar Thanks for the response. You mean the feature of delete? So what I'm working on is having data writing to storage through hudi with hudi version 0.5.0 but I

[GitHub] [incubator-hudi] EdwinGuo commented on issue #1630: [SUPPORT] Latest commit does not have any schema in commit metadata

2020-05-14 Thread GitBox
EdwinGuo commented on issue #1630: URL: https://github.com/apache/incubator-hudi/issues/1630#issuecomment-628778573 @bvaradar Thanks for the response. You mean the feature of delete? So what I'm working on is having data writing to storage through hudi with hudi version 0.5.0 but I was

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1151: [HUDI-476] Add hudi-examples module

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1151: URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-628775947 @lamber-ken are you able to take this across finish line? @dengziming has something that is very close to a first version.. we can try to land that and then

[jira] [Commented] (HUDI-767) Support transformation when export to Hudi

2020-05-14 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17107485#comment-17107485 ] Vinoth Chandar commented on HUDI-767: - yeah agree. we can defer this > Support transformation when

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1094: [WIP] [HUDI-375] Refactor the configure framework of hudi project

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1094: URL: https://github.com/apache/incubator-hudi/pull/1094#issuecomment-628766636 cc @n3nash we need something like this.. but more full fledged with fallback key support etc ... closing and saving for later

[GitHub] [incubator-hudi] vinothchandar closed pull request #1094: [WIP] [HUDI-375] Refactor the configure framework of hudi project

2020-05-14 Thread GitBox
vinothchandar closed pull request #1094: URL: https://github.com/apache/incubator-hudi/pull/1094 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1409: [HUDI-714]Add javadoc and comments to hudi write method link

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1409: URL: https://github.com/apache/incubator-hudi/pull/1409#issuecomment-628765699 @nsivabalan can you please re-review and see this home This is an automated message from the

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1471: [WIP][HUDI-752]Make CompactionAdminClient spark-free

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1471: URL: https://github.com/apache/incubator-hudi/pull/1471#issuecomment-628765061 Closing due to inactivity and we have done a few different fixes around this. Please rebase, reopen if it's still relevant

[GitHub] [incubator-hudi] vinothchandar closed pull request #1471: [WIP][HUDI-752]Make CompactionAdminClient spark-free

2020-05-14 Thread GitBox
vinothchandar closed pull request #1471: URL: https://github.com/apache/incubator-hudi/pull/1471 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[jira] [Commented] (HUDI-774) Spark to Avro converter incorrectly generates optional fields

2020-05-14 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17107473#comment-17107473 ] Vinoth Chandar commented on HUDI-774: - yes [~uditme] is going to look at it as well > Spark to Avro

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1558: [HUDI-796]: added deduping logic for upserts case

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1558: URL: https://github.com/apache/incubator-hudi/pull/1558#issuecomment-628761300 I will let @yanghua see this home This is an automated message from the Apache Git Service. To

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1596: [HUDI-863] get decimal properties from derived spark DataType

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1596: URL: https://github.com/apache/incubator-hudi/pull/1596#issuecomment-628755876 @umehrot2 gentle ping :) This is an automated message from the Apache Git Service. To respond to

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1611: [HUDI-705]Add unit test for RollbacksCommand

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1611: URL: https://github.com/apache/incubator-hudi/pull/1611#issuecomment-628755381 @yanghua you can review this as well if possible :) This is an automated message from the Apache

[jira] [Updated] (HUDI-888) NPE when compacting via hudi-cli and providing a compaction props file

2020-05-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-888: Labels: pull-request-available (was: ) > NPE when compacting via hudi-cli and providing a

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1622: [HUDI-888] fix NullPointerException

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1622: URL: https://github.com/apache/incubator-hudi/pull/1622#issuecomment-628751099 @leesf can you please review this one? This is an automated message from the Apache Git Service.

[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1616: [HUDI-786] Fixing read beyond inline length in InlineFS

2020-05-14 Thread GitBox
vinothchandar commented on a change in pull request #1616: URL: https://github.com/apache/incubator-hudi/pull/1616#discussion_r425276842 ## File path: hudi-common/src/main/java/org/apache/hudi/common/fs/inline/InLineFsDataInputStream.java ## @@ -56,24 +56,29 @@ public long

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1566: [HUDI-603]: DeltaStreamer can now fetch schema before every run in continuous mode

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1566: URL: https://github.com/apache/incubator-hudi/pull/1566#issuecomment-628749732 @bvaradar this and #1518 are again related.. Can you take both of these home ? This is an

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1518: [HUDI-723] Register avro schema if infered from SQL transformation

2020-05-14 Thread GitBox
vinothchandar commented on pull request #1518: URL: https://github.com/apache/incubator-hudi/pull/1518#issuecomment-628747799 @bvaradar Could you review this one? It hits close to the transformer/schemaprovider changes, which you are more familiar with

[GitHub] [incubator-hudi] bvaradar commented on issue #1630: [SUPPORT] Latest commit does not have any schema in commit metadata

2020-05-14 Thread GitBox
bvaradar commented on issue #1630: URL: https://github.com/apache/incubator-hudi/issues/1630#issuecomment-628744270 @EdwinGuo : This feature is available only from 0.5.1 onwards. This is an automated message from the Apache
