[GitHub] [incubator-hudi] lamber-ken commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read

2020-02-14 Thread GitBox
lamber-ken commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read URL: https://github.com/apache/incubator-hudi/issues/1335#issuecomment-586335056 The `KafkaUtils#fixKafkaParams` method in `spark-streaming-kafka-0-10_2.11-2.4.4`, and KafkaUtil

[GitHub] [incubator-hudi] adamjoneill edited a comment on issue #1325: presto - querying nested object in parquet file created by hudi

2020-02-14 Thread GitBox
adamjoneill edited a comment on issue #1325: presto - querying nested object in parquet file created by hudi URL: https://github.com/apache/incubator-hudi/issues/1325#issuecomment-586284798 @vinothchandar I've managed to reproduce with a simple spark.parallelize() example.

[GitHub] [incubator-hudi] leesf commented on a change in pull request #1333: [HUDI-589][DOCS] Fix querying_data page

2020-02-14 Thread GitBox
leesf commented on a change in pull request #1333: [HUDI-589][DOCS] Fix querying_data page URL: https://github.com/apache/incubator-hudi/pull/1333#discussion_r379428759 ## File path: docs/_docs/2_3_querying_data.md ## @@ -84,55 +102,53 @@ using the hive session property

[GitHub] [incubator-hudi] adamjoneill edited a comment on issue #1325: presto - querying nested object in parquet file created by hudi

2020-02-14 Thread GitBox
adamjoneill edited a comment on issue #1325: presto - querying nested object in parquet file created by hudi URL: https://github.com/apache/incubator-hudi/issues/1325#issuecomment-586284798 @vinothchandar I've managed to reproduce with a simple spark.parallelize() example.

[GitHub] [incubator-hudi] adamjoneill commented on issue #1325: presto - querying nested object in parquet file created by hudi

2020-02-14 Thread GitBox
adamjoneill commented on issue #1325: presto - querying nested object in parquet file created by hudi URL: https://github.com/apache/incubator-hudi/issues/1325#issuecomment-586284798 @vinothchandar I've managed to reproduce with a simple spark.parallelize() example. test.scala

[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read

2020-02-14 Thread GitBox
lamber-ken edited a comment on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read URL: https://github.com/apache/incubator-hudi/issues/1335#issuecomment-586335056 The `KafkaUtils#fixKafkaParams` method in `spark-streaming-kafka-0-10_2.11-2.4.4`, and

[jira] [Created] (HUDI-613) Refactor and enhance the Transformer component

2020-02-14 Thread vinoyang (Jira)
vinoyang created HUDI-613: - Summary: Refactor and enhance the Transformer component Key: HUDI-613 URL: https://issues.apache.org/jira/browse/HUDI-613 Project: Apache Hudi (incubating) Issue Type:

[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read

2020-02-14 Thread GitBox
lamber-ken edited a comment on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read URL: https://github.com/apache/incubator-hudi/issues/1335#issuecomment-586285604 That PR is used to check whether the checkpoint offsets are valid. Bad to see use

[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read

2020-02-14 Thread GitBox
lamber-ken edited a comment on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read URL: https://github.com/apache/incubator-hudi/issues/1335#issuecomment-586285604 That PR is used to check whether the checkpoint offsets are valid. Bad to see use

[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read

2020-02-14 Thread GitBox
lamber-ken edited a comment on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read URL: https://github.com/apache/incubator-hudi/issues/1335#issuecomment-586290053 Because using `0.5.0-incubating` can't solve your problem quickly, we need to go back to

[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read

2020-02-14 Thread GitBox
lamber-ken edited a comment on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read URL: https://github.com/apache/incubator-hudi/issues/1335#issuecomment-586335056 The `KafkaUtils#fixKafkaParams` method in `spark-streaming-kafka-0-10_2.11-2.4.4`, and

[GitHub] [incubator-hudi] leesf commented on a change in pull request #1333: [HUDI-589][DOCS] Fix querying_data page

2020-02-14 Thread GitBox
leesf commented on a change in pull request #1333: [HUDI-589][DOCS] Fix querying_data page URL: https://github.com/apache/incubator-hudi/pull/1333#discussion_r379418568 ## File path: docs/_docs/2_3_querying_data.md ## @@ -9,7 +9,7 @@ last_modified_at:

[GitHub] [incubator-hudi] lamber-ken commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read

2020-02-14 Thread GitBox
lamber-ken commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read URL: https://github.com/apache/incubator-hudi/issues/1335#issuecomment-586290053 We need to go back to the `0.5.1-incubating` version, then try to find a valid way to solve it. :)

[GitHub] [incubator-hudi] adamjoneill edited a comment on issue #1325: presto - querying nested object in parquet file created by hudi

2020-02-14 Thread GitBox
adamjoneill edited a comment on issue #1325: presto - querying nested object in parquet file created by hudi URL: https://github.com/apache/incubator-hudi/issues/1325#issuecomment-586284798 @vinothchandar I've managed to reproduce with a simple spark.parallelize() example.

[GitHub] [incubator-hudi] n3nash commented on a change in pull request #1332: [HUDI -409] Match header and footer block length to improve corrupted block detection

2020-02-14 Thread GitBox
n3nash commented on a change in pull request #1332: [HUDI -409] Match header and footer block length to improve corrupted block detection URL: https://github.com/apache/incubator-hudi/pull/1332#discussion_r379557708 ## File path:

[GitHub] [incubator-hudi] n3nash edited a comment on issue #1242: [HUDI-544] Adjust the read and write path of archive

2020-02-14 Thread GitBox
n3nash edited a comment on issue #1242: [HUDI-544] Adjust the read and write path of archive URL: https://github.com/apache/incubator-hudi/pull/1242#issuecomment-586394357 @hddong please take a look at the last comment and squash all your commits

[GitHub] [incubator-hudi] n3nash commented on issue #1242: [HUDI-544] Adjust the read and write path of archive

2020-02-14 Thread GitBox
n3nash commented on issue #1242: [HUDI-544] Adjust the read and write path of archive URL: https://github.com/apache/incubator-hudi/pull/1242#issuecomment-586394357 @hddong please take a look at the last comment This is an

[GitHub] [incubator-hudi] n3nash commented on a change in pull request #1332: [HUDI -409] Match header and footer block length to improve corrupted block detection

2020-02-14 Thread GitBox
n3nash commented on a change in pull request #1332: [HUDI -409] Match header and footer block length to improve corrupted block detection URL: https://github.com/apache/incubator-hudi/pull/1332#discussion_r379556420 ## File path:

[GitHub] [incubator-hudi] n3nash commented on a change in pull request #1320: [HUDI-571] Add min/max headers on archived files

2020-02-14 Thread GitBox
n3nash commented on a change in pull request #1320: [HUDI-571] Add min/max headers on archived files URL: https://github.com/apache/incubator-hudi/pull/1320#discussion_r379559714 ## File path:

[GitHub] [incubator-hudi] n3nash commented on a change in pull request #1320: [HUDI-571] Add min/max headers on archived files

2020-02-14 Thread GitBox
n3nash commented on a change in pull request #1320: [HUDI-571] Add min/max headers on archived files URL: https://github.com/apache/incubator-hudi/pull/1320#discussion_r379559545 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieLogBlock.java

[GitHub] [incubator-hudi] n3nash commented on a change in pull request #1242: [HUDI-544] Adjust the read and write path of archive

2020-02-14 Thread GitBox
n3nash commented on a change in pull request #1242: [HUDI-544] Adjust the read and write path of archive URL: https://github.com/apache/incubator-hudi/pull/1242#discussion_r379560656 ## File path: hudi-cli/src/main/java/org/apache/hudi/cli/commands/ArchivedCommitsCommand.java

[GitHub] [incubator-hudi] n3nash commented on a change in pull request #1320: [HUDI-571] Add min/max headers on archived files

2020-02-14 Thread GitBox
n3nash commented on a change in pull request #1320: [HUDI-571] Add min/max headers on archived files URL: https://github.com/apache/incubator-hudi/pull/1320#discussion_r379558732 ## File path: hudi-client/src/main/java/org/apache/hudi/io/HoodieCommitArchiveLog.java ## @@

[GitHub] [incubator-hudi] amitsingh-10 commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read

2020-02-14 Thread GitBox
amitsingh-10 commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read URL: https://github.com/apache/incubator-hudi/issues/1335#issuecomment-586408479 Didn't help. I had already tried, but I tried it again with the `0.5.1-incubating` version.

[incubator-hudi] branch master updated: [HUDI-571] Add show archived compaction(s) to CLI

2020-02-14 Thread nagarwal
This is an automated email from the ASF dual-hosted git repository. nagarwal pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git The following commit(s) were added to refs/heads/master by this push: new 20ed251 [HUDI-571] Add show archived

[GitHub] [incubator-hudi] n3nash merged pull request #1312: [HUDI-571] Add "compactions show archived" command to CLI

2020-02-14 Thread GitBox
n3nash merged pull request #1312: [HUDI-571] Add "compactions show archived" command to CLI URL: https://github.com/apache/incubator-hudi/pull/1312 This is an automated message from the Apache Git Service. To respond to the

[GitHub] [incubator-hudi] bhasudha commented on issue #1325: presto - querying nested object in parquet file created by hudi

2020-02-14 Thread GitBox
bhasudha commented on issue #1325: presto - querying nested object in parquet file created by hudi URL: https://github.com/apache/incubator-hudi/issues/1325#issuecomment-586453479 @adamjoneill apologies for the delayed response. Haven't gotten a chance to look at this thread. Let me

[GitHub] [incubator-hudi] bwu2 commented on issue #1328: Hudi upsert hangs

2020-02-14 Thread GitBox
bwu2 commented on issue #1328: Hudi upsert hangs URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-586457085 See: https://gist.github.com/bwu2/e432a42f51519f27197f4785af3e1abf

[GitHub] [incubator-hudi] vinothchandar commented on issue #1325: presto - querying nested object in parquet file created by hudi

2020-02-14 Thread GitBox
vinothchandar commented on issue #1325: presto - querying nested object in parquet file created by hudi URL: https://github.com/apache/incubator-hudi/issues/1325#issuecomment-586450568 Thanks @adamjoneill, let me try to reproduce as well and see what's going on tonight.

[GitHub] [incubator-hudi] vinothchandar commented on issue #1328: Hudi upsert hangs

2020-02-14 Thread GitBox
vinothchandar commented on issue #1328: Hudi upsert hangs URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-586451527 Even #800 is a reasonable workload. I don't understand what's going on here. It's just a single file being versioned, same as the next two commits,

[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1333: [HUDI-589][DOCS] Fix querying_data page

2020-02-14 Thread GitBox
vinothchandar commented on a change in pull request #1333: [HUDI-589][DOCS] Fix querying_data page URL: https://github.com/apache/incubator-hudi/pull/1333#discussion_r379646458 ## File path: docs/_docs/2_3_querying_data.md ## @@ -9,7 +9,7 @@ last_modified_at:

[GitHub] [incubator-hudi] popart commented on issue #1329: [SUPPORT] Presto cannot query non-partitioned table

2020-02-14 Thread GitBox
popart commented on issue #1329: [SUPPORT] Presto cannot query non-partitioned table URL: https://github.com/apache/incubator-hudi/issues/1329#issuecomment-586509050 Hi Bhavani! Thank you for taking a look. I filed https://issues.apache.org/jira/browse/HUDI-614. Correct the Presto version

[GitHub] [incubator-hudi] popart edited a comment on issue #1329: [SUPPORT] Presto cannot query non-partitioned table

2020-02-14 Thread GitBox
popart edited a comment on issue #1329: [SUPPORT] Presto cannot query non-partitioned table URL: https://github.com/apache/incubator-hudi/issues/1329#issuecomment-586509050 Hi Bhavani! Thank you for taking a look. I filed https://issues.apache.org/jira/browse/HUDI-614. Correct the

[GitHub] [incubator-hudi] ramachandranms commented on a change in pull request #1332: [HUDI -409] Match header and footer block length to improve corrupted block detection

2020-02-14 Thread GitBox
ramachandranms commented on a change in pull request #1332: [HUDI -409] Match header and footer block length to improve corrupted block detection URL: https://github.com/apache/incubator-hudi/pull/1332#discussion_r379689535 ## File path:

[jira] [Created] (HUDI-614) .hoodie_partition_metadata created for non-partitioned table

2020-02-14 Thread Andrew Wong (Jira)
Andrew Wong created HUDI-614: Summary: .hoodie_partition_metadata created for non-partitioned table Key: HUDI-614 URL: https://issues.apache.org/jira/browse/HUDI-614 Project: Apache Hudi (incubating)

[GitHub] [incubator-hudi] lamber-ken commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read

2020-02-14 Thread GitBox
lamber-ken commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read URL: https://github.com/apache/incubator-hudi/issues/1335#issuecomment-586520820 I tested `0.5.1-incubating` with `auto.offset.reset=earliest`, and it works well in my local env.

[GitHub] [incubator-hudi] leesf merged pull request #1336: [MINOR] Fix some typos

2020-02-14 Thread GitBox
leesf merged pull request #1336: [MINOR] Fix some typos URL: https://github.com/apache/incubator-hudi/pull/1336

[incubator-hudi] branch master updated: [MINOR] Fix some typos

2020-02-14 Thread leesf
This is an automated email from the ASF dual-hosted git repository. leesf pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git The following commit(s) were added to refs/heads/master by this push: new aaa6cf9 [MINOR] Fix some typos aaa6cf9

[GitHub] [incubator-hudi] wangxianghu opened a new pull request #1336: [MINOR] Fix some typos

2020-02-14 Thread GitBox
wangxianghu opened a new pull request #1336: [MINOR] Fix some typos URL: https://github.com/apache/incubator-hudi/pull/1336 ## What is the purpose of the pull request *Fix some typos* ## Brief change log *Fix some typos* ## Verify this pull request This

[GitHub] [incubator-hudi] nsivabalan commented on issue #1165: [HUDI-76] Add CSV Source support for Hudi Delta Streamer

2020-02-14 Thread GitBox
nsivabalan commented on issue #1165: [HUDI-76] Add CSV Source support for Hudi Delta Streamer URL: https://github.com/apache/incubator-hudi/pull/1165#issuecomment-586545073 Sure, go ahead. I also plan to review it sometime, but will let you be the primary reviewer.

[GitHub] [incubator-hudi] smarthi opened a new pull request #1337: [MINOR] Code Cleanup, remove redundant code

2020-02-14 Thread GitBox
smarthi opened a new pull request #1337: [MINOR] Code Cleanup, remove redundant code URL: https://github.com/apache/incubator-hudi/pull/1337 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a

[GitHub] [incubator-hudi] lamber-ken commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read

2020-02-14 Thread GitBox
lamber-ken commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read URL: https://github.com/apache/incubator-hudi/issues/1335#issuecomment-586525355 From the stack trace, the application will read three partitions; can you check the range is

[GitHub] [incubator-hudi] lamber-ken commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read

2020-02-14 Thread GitBox
lamber-ken commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read URL: https://github.com/apache/incubator-hudi/issues/1335#issuecomment-586285604 That PR is used to check whether the checkpoint offsets are valid. Bad to see use `0.5.0-incubating`

[GitHub] [incubator-hudi] lamber-ken removed a comment on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read

2020-02-14 Thread GitBox
lamber-ken removed a comment on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read URL: https://github.com/apache/incubator-hudi/issues/1335#issuecomment-586290053 Because using `0.5.0-incubating` can't solve your problem quickly, we need to go back to

[GitHub] [incubator-hudi] amitsingh-10 commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read

2020-02-14 Thread GitBox
amitsingh-10 commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read URL: https://github.com/apache/incubator-hudi/issues/1335#issuecomment-586330008 The Kafka version is `2.1.0-cp2`. Also is there any PR/issue where I can better understand why

[GitHub] [incubator-hudi] amitsingh-10 edited a comment on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read

2020-02-14 Thread GitBox
amitsingh-10 edited a comment on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read URL: https://github.com/apache/incubator-hudi/issues/1335#issuecomment-586330008 The Kafka version is `2.1.0-cp2`. Also is there any PR/issue where I can better understand

Build failed in Jenkins: hudi-snapshot-deployment-0.5 #189

2020-02-14 Thread Apache Jenkins Server
See Changes: -- [...truncated 2.25 KB...] /home/jenkins/tools/maven/apache-maven-3.5.4/boot: plexus-classworlds-2.5.2.jar

[GitHub] [incubator-hudi] jinshuangxian commented on issue #954: org.apache.hudi.org.apache.hadoop_hive.metastore.api.NoSuchObjectException: table not found

2020-02-14 Thread GitBox
jinshuangxian commented on issue #954: org.apache.hudi.org.apache.hadoop_hive.metastore.api.NoSuchObjectException: table not found URL: https://github.com/apache/incubator-hudi/issues/954#issuecomment-586161952 > @gfn9cho you are right, Glue Catalog does not support Primary Key. It's not

[GitHub] [incubator-hudi] amitsingh-10 opened a new issue #1335: [SUPPORT] HoodieDeltaStreamer offset reset not working

2020-02-14 Thread GitBox
amitsingh-10 opened a new issue #1335: [SUPPORT] HoodieDeltaStreamer offset reset not working URL: https://github.com/apache/incubator-hudi/issues/1335 **Describe the problem you faced** I am trying to create a Hoodie table using `HoodieDeltaStreamer` from a Kafka Avro topic. Setting

[GitHub] [incubator-hudi] pratyakshsharma commented on issue #1165: [HUDI-76] Add CSV Source support for Hudi Delta Streamer

2020-02-14 Thread GitBox
pratyakshsharma commented on issue #1165: [HUDI-76] Add CSV Source support for Hudi Delta Streamer URL: https://github.com/apache/incubator-hudi/pull/1165#issuecomment-586154037 @vinothchandar will review it by EOD today. :)

[GitHub] [incubator-hudi] amitsingh-10 commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read

2020-02-14 Thread GitBox
amitsingh-10 commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read URL: https://github.com/apache/incubator-hudi/issues/1335#issuecomment-586213714 Thanks for the reply @lamber-ken. I tried working with `0.5.0-incubating`; however, I am getting the

[GitHub] [incubator-hudi] lamber-ken commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read

2020-02-14 Thread GitBox
lamber-ken commented on issue #1335: [SUPPORT] HoodieDeltaStreamer Kafka offset reset config not being read URL: https://github.com/apache/incubator-hudi/issues/1335#issuecomment-586170746 Hi @amitsingh-10, `0.5.1-incubating` is built with spark-2.4.4; as you reported,