[jira] [Updated] (HUDI-1392) lose partition info when using spark parameter "basePath"

2020-11-10 Thread steven zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] steven zhang updated HUDI-1392: --- Description: Reproduce the issue with the steps below: set

[jira] [Commented] (HUDI-432) Benchmark HFile for scan vs seek

2020-11-10 Thread Song Jun (Jira)
[ https://issues.apache.org/jira/browse/HUDI-432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229809#comment-17229809 ] Song Jun commented on HUDI-432: --- Have we decided to use HFile to store the index? What's the next plan? Thanks.

[jira] [Created] (HUDI-1393) Add compaction action in archive command

2020-11-10 Thread hong dongdong (Jira)
hong dongdong created HUDI-1393: --- Summary: Add compaction action in archive command Key: HUDI-1393 URL: https://issues.apache.org/jira/browse/HUDI-1393 Project: Apache Hudi Issue Type: Bug

[jira] [Created] (HUDI-1392) lose partition info when using spark parameter "basePath"

2020-11-10 Thread steven zhang (Jira)
steven zhang created HUDI-1392: -- Summary: lose partition info when using spark parameter "basePath" Key: HUDI-1392 URL: https://issues.apache.org/jira/browse/HUDI-1392 Project: Apache Hudi

[GitHub] [hudi] codecov-io edited a comment on pull request #2192: [HUDI-1343] Add standard schema postprocessor which would rewrite the schema using spark-avro conversion

2020-11-10 Thread GitBox
codecov-io edited a comment on pull request #2192: URL: https://github.com/apache/hudi/pull/2192#issuecomment-717655063 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2192?src=pr=h1) Report > Merging [#2192](https://codecov.io/gh/apache/hudi/pull/2192?src=pr=desc) (f82b423) into

[jira] [Commented] (HUDI-1391) Added tools: support to obtain multiple data of different schema types from a single topic

2020-11-10 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229689#comment-17229689 ] liujinhui commented on HUDI-1391: - Currently getting CDC data from Kafka will encounter data with

[jira] [Assigned] (HUDI-1391) Added tools: support to obtain multiple data of different schema types from a single topic

2020-11-10 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui reassigned HUDI-1391: --- Assignee: liujinhui > Added tools: support to obtain multiple data of different schema types from a

[jira] [Updated] (HUDI-1391) Added tools: support to obtain multiple data of different schema types from a single topic

2020-11-10 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui updated HUDI-1391: Status: In Progress (was: Open) > Added tools: support to obtain multiple data of different schema types

[jira] [Created] (HUDI-1391) Added tools: support to obtain multiple data of different schema types from a single topic

2020-11-10 Thread liujinhui (Jira)
liujinhui created HUDI-1391: --- Summary: Added tools: support to obtain multiple data of different schema types from a single topic Key: HUDI-1391 URL: https://issues.apache.org/jira/browse/HUDI-1391

[jira] [Updated] (HUDI-1391) Added tools: support to obtain multiple data of different schema types from a single topic

2020-11-10 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui updated HUDI-1391: Status: Open (was: New) > Added tools: support to obtain multiple data of different schema types from a >

[GitHub] [hudi] codecov-io edited a comment on pull request #2227: [HUDI-1367] Make delastreamer transition from dfsSouce to kafkasouce

2020-11-10 Thread GitBox
codecov-io edited a comment on pull request #2227: URL: https://github.com/apache/hudi/pull/2227#issuecomment-720892015 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2227?src=pr=h1) Report > Merging [#2227](https://codecov.io/gh/apache/hudi/pull/2227?src=pr=desc) (d3fe5e0) into

[GitHub] [hudi] liujinhui1994 opened a new pull request #2242: [HUDI-1366] Make deltasteamer support exporting data from hdfs to hudi

2020-11-10 Thread GitBox
liujinhui1994 opened a new pull request #2242: URL: https://github.com/apache/hudi/pull/2242 Abstract DFSPathSelector ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.*

[jira] [Commented] (HUDI-1377) clean duplicate code in HoodieSparkSqlWriter

2020-11-10 Thread wangxianghu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229629#comment-17229629 ] wangxianghu commented on HUDI-1377: --- [~wangshikai] please assign this ticket to yourself, then we can

[jira] [Commented] (HUDI-1377) clean duplicate code in HoodieSparkSqlWriter

2020-11-10 Thread wangxianghu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229626#comment-17229626 ] wangxianghu commented on HUDI-1377: --- done via master branch : 430d4b428e7c5b325c7414a187f9cda158c2758a

[jira] [Updated] (HUDI-1377) clean duplicate code in HoodieSparkSqlWriter

2020-11-10 Thread wangxianghu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangxianghu updated HUDI-1377: -- Status: Open (was: New) > clean duplicate code in HoodieSparkSqlWriter >

[GitHub] [hudi] vinothchandar commented on a change in pull request #2136: [HUDI-37] Persist the HoodieIndex type in the hoodie.properties file

2020-11-10 Thread GitBox
vinothchandar commented on a change in pull request #2136: URL: https://github.com/apache/hudi/pull/2136#discussion_r520944072 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java ## @@ -85,4 +91,61 @@ public static

[GitHub] [hudi] vinothchandar commented on pull request #2106: [HUDI-1284] preCombine all HoodieRecords and update all fields according to orderingVal

2020-11-10 Thread GitBox
vinothchandar commented on pull request #2106: URL: https://github.com/apache/hudi/pull/2106#issuecomment-725032426 @Karl-WangSK any updates on this? Happy to help with any open ended issues here This is an automated

[GitHub] [hudi] vinothchandar commented on a change in pull request #2216: [HUDI-1357] Added a check to ensure no records are lost during updates.

2020-11-10 Thread GitBox
vinothchandar commented on a change in pull request #2216: URL: https://github.com/apache/hudi/pull/2216#discussion_r520939276 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieWriteStat.java ## @@ -49,6 +49,12 @@ */ private String prevCommit;

[jira] [Updated] (HUDI-60) [UMBRELLA] Support Apache Beam for incremental tailing

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-60: --- Labels: gsoc gsoc2021 mentor (was: gsoc2021 mentor) > [UMBRELLA] Support Apache Beam for incremental tailing >

[jira] [Updated] (HUDI-1390) [UMBRELLA] Support schema inference for unstructured data

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1390: - Labels: gsoc gsoc2021 mentor (was: gsoc2021 mentor) > [UMBRELLA] Support schema inference for

[jira] [Updated] (HUDI-1385) [UMBRELLA] Improve source ingestion support in DeltaStreamer

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1385: - Labels: gsoc gsoc2021 mentor (was: gsoc2021 mentor) > [UMBRELLA] Improve source ingestion support in

[jira] [Updated] (HUDI-1237) [UMBRELLA] Checkstyle, formatting, warnings, spotless

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1237: - Labels: gsoc gsoc2021 mentor (was: gsoc2021 mentor) > [UMBRELLA] Checkstyle, formatting, warnings,

[jira] [Updated] (HUDI-1387) [UMBRELLA] Support Apache Calcite for querying Hudi datasets

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1387: - Labels: gsoc gsoc2021 mentor (was: gsoc2021 mentor) > [UMBRELLA] Support Apache Calcite for querying

[jira] [Updated] (HUDI-1389) [UMBRELLA] Survey indexing technique for better query performance

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1389: - Labels: gsoc gsoc2021 mentor (was: gsoc2021 mentor) > [UMBRELLA] Survey indexing technique for better

[jira] [Updated] (HUDI-1388) [UMBRELLA] Improve CLI features and usabilities

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1388: - Labels: gsoc gsoc2021 mentor (was: gsoc2021 mentor) > [UMBRELLA] Improve CLI features and usabilities >

[jira] [Updated] (HUDI-60) [UMBRELLA] Support Apache Beam for incremental tailing

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-60: --- Summary: [UMBRELLA] Support Apache Beam for incremental tailing (was: [UMBRELLA] Beam IO module to support

[jira] [Updated] (HUDI-1237) [UMBRELLA] Checkstyle, formatting, warnings, spotless

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1237: - Component/s: (was: Testing) Code Cleanup > [UMBRELLA] Checkstyle, formatting,

[jira] [Updated] (HUDI-1389) [UMBRELLA] Survey indexing technique for better query performance

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1389: - Component/s: Performance > [UMBRELLA] Survey indexing technique for better query performance >

[jira] [Updated] (HUDI-1388) [UMBRELLA] Improve CLI features and usabilities

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1388: - Component/s: Usability > [UMBRELLA] Improve CLI features and usabilities >

[jira] [Created] (HUDI-1390) [UMBRELLA] Support schema inference for unstructured data

2020-11-10 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-1390: Summary: [UMBRELLA] Support schema inference for unstructured data Key: HUDI-1390 URL: https://issues.apache.org/jira/browse/HUDI-1390 Project: Apache Hudi Issue

[jira] [Created] (HUDI-1389) [UMBRELLA] Survey indexing technique for better query performance

2020-11-10 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-1389: Summary: [UMBRELLA] Survey indexing technique for better query performance Key: HUDI-1389 URL: https://issues.apache.org/jira/browse/HUDI-1389 Project: Apache Hudi

[jira] [Updated] (HUDI-145) Limit the amount of partitions considered for GlobalBloomIndex

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-145: Labels: (was: gsoc2021 mentor) > Limit the amount of partitions considered for GlobalBloomIndex >

[jira] [Updated] (HUDI-74) Improve compaction support in HoodieDeltaStreamer & CLI

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-74?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-74: --- Labels: (was: gsoc2021 mentor) > Improve compaction support in HoodieDeltaStreamer & CLI >

[jira] [Updated] (HUDI-388) Support DDL / DML SparkSQL statements which useful for admins

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-388: Labels: (was: gsoc2021 mentor) > Support DDL / DML SparkSQL statements which useful for admins >

[jira] [Updated] (HUDI-67) Tool to convert sequence file based archived commits to log format #224

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-67?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-67: --- Labels: (was: gsoc2021 mentor) > Tool to convert sequence file based archived commits to log format #224 >

[jira] [Updated] (HUDI-1388) [UMBRELLA] Improve CLI features and usabilities

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1388: - Labels: gsoc2021 mentor (was: ) > [UMBRELLA] Improve CLI features and usabilities >

[jira] [Created] (HUDI-1388) [UMBRELLA] Improve CLI features and usabilities

2020-11-10 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-1388: Summary: [UMBRELLA] Improve CLI features and usabilities Key: HUDI-1388 URL: https://issues.apache.org/jira/browse/HUDI-1388 Project: Apache Hudi Issue Type:

[jira] [Commented] (HUDI-1387) [UMBRELLA] Support Apache Calcite for querying Hudi datasets

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229508#comment-17229508 ] Raymond Xu commented on HUDI-1387: -- [~vinoth] Made this under presto integration component. Shall we

[jira] [Created] (HUDI-1387) [UMBRELLA] Support Apache Calcite for querying Hudi datasets

2020-11-10 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-1387: Summary: [UMBRELLA] Support Apache Calcite for querying Hudi datasets Key: HUDI-1387 URL: https://issues.apache.org/jira/browse/HUDI-1387 Project: Apache Hudi

[jira] [Updated] (HUDI-60) [UMBRELLA] Beam IO module to support incremental tailing of Hoodie Hive/Spark tables

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-60: --- Summary: [UMBRELLA] Beam IO module to support incremental tailing of Hoodie Hive/Spark tables (was: Beam IO

[jira] [Updated] (HUDI-60) Beam IO module to support incremental tailing of Hoodie Hive/Spark tables #8

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-60: --- Description: (More details to be added) (was: https://github.com/uber/hudi/issues/8) > Beam IO module to

[jira] [Updated] (HUDI-96) Use Command line options instead of positional arguments when launching spark applications from various CLI commands

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-96?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-96: --- Labels: newbie pull-request-available (was: gsoc2021 mentor newbie pull-request-available) > Use Command line

[jira] [Created] (HUDI-1386) AWS kinesis data source for DeltaStreamer

2020-11-10 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-1386: Summary: AWS kinesis data source for DeltaStreamer Key: HUDI-1386 URL: https://issues.apache.org/jira/browse/HUDI-1386 Project: Apache Hudi Issue Type: New Feature

[jira] [Updated] (HUDI-74) Improve compaction support in HoodieDeltaStreamer & CLI

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-74?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-74: --- Component/s: CLI > Improve compaction support in HoodieDeltaStreamer & CLI >

[jira] [Updated] (HUDI-735) Improve deltastreamer error message when case mismatch of commandline arguments.

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-735: Labels: (was: bug-bash-0.6.0) > Improve deltastreamer error message when case mismatch of commandline >

[jira] [Updated] (HUDI-735) Improve deltastreamer error message when case mismatch of commandline arguments.

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-735: Labels: bug-bash-0.6.0 (was: bug-bash-0.6.0 gsoc2021 mentor) > Improve deltastreamer error message when

[jira] [Updated] (HUDI-735) Improve deltastreamer error message when case mismatch of commandline arguments.

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-735: Component/s: (was: Utilities) Code Cleanup > Improve deltastreamer error message when

[jira] [Updated] (HUDI-73) Support vanilla Avro Kafka Source in HoodieDeltaStreamer

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-73?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-73: --- Labels: pull-request-available (was: gsoc2021 mentor pull-request-available) > Support vanilla Avro Kafka

[jira] [Updated] (HUDI-246) Apache Pulsar data source for Hudi

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-246: Labels: (was: gsoc2021 mentor) > Apache Pulsar data source for Hudi > -- >

[jira] [Updated] (HUDI-1385) [UMBRELLA] Improve source ingestion support in DeltaStreamer

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1385: - Summary: [UMBRELLA] Improve source ingestion support in DeltaStreamer (was: [UMBRELLA] Improve source

[jira] [Updated] (HUDI-1290) Implement Debezium avro source for Delta Streamer

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1290: - Labels: (was: gsoc2021 mentor) > Implement Debezium avro source for Delta Streamer >

[jira] [Updated] (HUDI-488) Refactor Source classes in hudi-utilities

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-488: Labels: (was: gsoc2021 mentor) > Refactor Source classes in hudi-utilities >

[jira] [Created] (HUDI-1385) [UMBRELLA] Improve source support in DeltaStreamer

2020-11-10 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-1385: Summary: [UMBRELLA] Improve source support in DeltaStreamer Key: HUDI-1385 URL: https://issues.apache.org/jira/browse/HUDI-1385 Project: Apache Hudi Issue Type:

[jira] [Updated] (HUDI-60) Beam IO module to support incremental tailing of Hoodie Hive/Spark tables #8

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-60: --- Labels: gsoc2021 mentor (was: ) > Beam IO module to support incremental tailing of Hoodie Hive/Spark tables #8

[jira] [Updated] (HUDI-534) Explore a new way to fix import order

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-534: Labels: (was: gsoc2021 mentor) > Explore a new way to fix import order >

[jira] [Updated] (HUDI-304) Bring back spotless plugin

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-304: Labels: bug-bash-0.6.0 help-wanted pull-request-available (was: bug-bash-0.6.0 gsoc2021 help-wanted mentor

[jira] [Updated] (HUDI-1001) Add implementation to translate source partition paths when doing metadata bootstrap

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1001: - Labels: (was: gsoc2021 mentor) > Add implementation to translate source partition paths when doing

[GitHub] [hudi] kpurella commented on issue #2240: [SUPPORT] Performance Issue : HUDI MOR ,UPSERT Job running forever

2020-11-10 Thread GitBox
kpurella commented on issue #2240: URL: https://github.com/apache/hudi/issues/2240#issuecomment-724917083 @bvaradar Thank you for your quick response. 1) We are not using any ordering, as my key is a composite key (a combination of 4 attributes). Sure, I will give it a try with

[jira] [Updated] (HUDI-270) [UMBRELLA] Improve Hudi website UI and documentation

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-270: Labels: (was: gsoc2021 mentor) > [UMBRELLA] Improve Hudi website UI and documentation >

[jira] [Updated] (HUDI-233) Redo log statements using SLF4J

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-233: Labels: pull-request-available (was: gsoc2021 mentor pull-request-available) > Redo log statements using

[jira] [Updated] (HUDI-767) Support transformation when export to Hudi

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-767: Labels: (was: gsoc2021 mentor) > Support transformation when export to Hudi >

[jira] [Updated] (HUDI-904) Segregate metrics configs by reporter type

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-904: Labels: (was: gsoc2021 mentor) > Segregate metrics configs by reporter type >

[jira] [Updated] (HUDI-791) Replace null by Option in Delta Streamer

2020-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-791: Status: Closed (was: Patch Available) > Replace null by Option in Delta Streamer >

[GitHub] [hudi] bvaradar commented on issue #2239: [SUPPORT] NoClassDefFoundError: org/apache/hudi/org/apache/commons/codec/binary/Base64

2020-11-10 Thread GitBox
bvaradar commented on issue #2239: URL: https://github.com/apache/hudi/issues/2239#issuecomment-724875552 @shenbinglife: Are you seeing the same issue with 0.6.0?

[GitHub] [hudi] vinothchandar merged pull request #2235: [HUDI-1377] remove duplicate code in HoodieSparkSqlWriter

2020-11-10 Thread GitBox
vinothchandar merged pull request #2235: URL: https://github.com/apache/hudi/pull/2235

[hudi] branch master updated (42b6aec -> 430d4b4)

2020-11-10 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 42b6aec [HUDI-1358] Fix Memory Leak in HoodieLogFormatWriter (#2217) add 430d4b4 [HUDI-1377] remove duplicate

[GitHub] [hudi] bvaradar commented on issue #2238: [SUPPORT] _hoodie_is_deleted support for Spark Datasource API in hudi 0.5.2-incubating

2020-11-10 Thread GitBox
bvaradar commented on issue #2238: URL: https://github.com/apache/hudi/issues/2238#issuecomment-724872746 @nsivabalan: Can you please take a look at this?

[GitHub] [hudi] bvaradar commented on issue #2240: [SUPPORT] Performance Issue : HUDI MOR ,UPSERT Job running forever

2020-11-10 Thread GitBox
bvaradar commented on issue #2240: URL: https://github.com/apache/hudi/issues/2240#issuecomment-724872433 Does your record key have any natural ordering? If not, you can disable range pruning by setting "hoodie.bloom.index.prune.by.ranges=false". Also, you are using GLOBAL_BLOOM, which is expected to scan

[GitHub] [hudi] codecov-io edited a comment on pull request #2241: [HUDI-1384] Decoupling hive jdbc dependency when HIVE_USE_JDBC_OPT_KE…

2020-11-10 Thread GitBox
codecov-io edited a comment on pull request #2241: URL: https://github.com/apache/hudi/pull/2241#issuecomment-724710949 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2241?src=pr=h1) Report > Merging [#2241](https://codecov.io/gh/apache/hudi/pull/2241?src=pr=desc) (519f16e) into

[GitHub] [hudi] codecov-io commented on pull request #2241: [HUDI-1384] Decoupling hive jdbc dependency when HIVE_USE_JDBC_OPT_KE…

2020-11-10 Thread GitBox
codecov-io commented on pull request #2241: URL: https://github.com/apache/hudi/pull/2241#issuecomment-724710949 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2241?src=pr=h1) Report > Merging [#2241](https://codecov.io/gh/apache/hudi/pull/2241?src=pr=desc) (519f16e) into

[jira] [Commented] (HUDI-791) Replace null by Option in Delta Streamer

2020-11-10 Thread Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229208#comment-17229208 ] Gary Li commented on HUDI-791: -- [~rxu] please close. not sure why but this apache account doesn't have access

[jira] [Updated] (HUDI-1384) Decoupling hive jdbc dependency when HIVE_USE_JDBC_OPT_KEY set false

2020-11-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1384: - Labels: pull-request-available (was: ) > Decoupling hive jdbc dependency when

[GitHub] [hudi] pengzhiwei2018 opened a new pull request #2241: [HUDI-1384] Decoupling hive jdbc dependency when HIVE_USE_JDBC_OPT_KE…

2020-11-10 Thread GitBox
pengzhiwei2018 opened a new pull request #2241: URL: https://github.com/apache/hudi/pull/2241 …Y set false

[jira] [Created] (HUDI-1384) Decoupling hive jdbc dependency when HIVE_USE_JDBC_OPT_KEY set false

2020-11-10 Thread pengzhiwei (Jira)
pengzhiwei created HUDI-1384: Summary: Decoupling hive jdbc dependency when HIVE_USE_JDBC_OPT_KEY set false Key: HUDI-1384 URL: https://issues.apache.org/jira/browse/HUDI-1384 Project: Apache Hudi

[jira] [Assigned] (HUDI-1383) Incorrect partitions getting hive synced

2020-11-10 Thread linshan-ma (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] linshan-ma reassigned HUDI-1383: Assignee: linshan-ma > Incorrect partitions getting hive synced >

[GitHub] [hudi] kpurella opened a new issue #2240: [SUPPORT] HUDI MOR ,UPSERT Job running forever

2020-11-10 Thread GitBox
kpurella opened a new issue #2240: URL: https://github.com/apache/hudi/issues/2240 **_Tips before filing an issue_** - Have you gone through our [FAQs](https://cwiki.apache.org/confluence/display/HUDI/FAQ)? yes - Join the mailing list to engage in conversations and get faster