[jira] [Commented] (HUDI-254) Provide mechanism for installing hudi-spark-bundle onto an existing spark installation

2019-09-17 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931817#comment-16931817 ] Udit Mehrotra commented on HUDI-254: [~vinoth] On EMR's side we have the same findings. *a + b + c + d*

[jira] [Commented] (HUDI-260) Hudi Spark Bundle does not work when passed in extraClassPath option

2019-09-17 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931918#comment-16931918 ] Udit Mehrotra commented on HUDI-260: Thanks for checking on this issue. Here is the exception

[jira] [Updated] (HUDI-268) Allow parquet/avro versions upgrading in Hudi

2019-09-20 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-268: --- Description: As of now Hudi depends on *Parquet* *1.8.1* and *Avro* *1.7.7* which might work fine for

[jira] [Created] (HUDI-268) Allow parquet/avro versions upgrading in Hudi

2019-09-20 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-268: -- Summary: Allow parquet/avro versions upgrading in Hudi Key: HUDI-268 URL: https://issues.apache.org/jira/browse/HUDI-268 Project: Apache Hudi (incubating) Issue

[jira] [Updated] (HUDI-268) Allow parquet/avro versions upgrading in Hudi

2019-09-20 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-268: --- Description: As of now Hudi depends on *Parquet* *1.8.1* and *Avro* *1.7.7* which might work fine for

[jira] [Updated] (HUDI-268) Allow parquet/avro versions upgrading in Hudi

2019-09-20 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-268: --- Description: As of now Hudi depends on *Parquet* *1.8.1* and *Avro* *1.7.7* which might work fine for

[jira] [Updated] (HUDI-268) Allow parquet/avro versions upgrading in Hudi

2019-09-20 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-268: --- Description: As of now Hudi depends on *Parquet* *1.8.1* and *Avro* *1.7.7* which might work fine for

[jira] [Assigned] (HUDI-268) Allow parquet/avro versions upgrading in Hudi

2019-09-20 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-268: -- Assignee: Udit Mehrotra > Allow parquet/avro versions upgrading in Hudi >

[jira] [Updated] (HUDI-268) Allow parquet/avro versions upgrading in Hudi

2019-09-20 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-268: --- Description: As of now Hudi depends on *Parquet* *1.8.1* and *Avro* *1.7.7* which might work fine for

[jira] [Updated] (HUDI-268) Allow parquet/avro versions upgrading in Hudi

2019-09-20 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-268: --- Component/s: (was: Presto Integration) > Allow parquet/avro versions upgrading in Hudi >

[jira] [Commented] (HUDI-268) Allow parquet/avro versions upgrading in Hudi

2019-09-20 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934817#comment-16934817 ] Udit Mehrotra commented on HUDI-268: Created PR for the same

[jira] [Commented] (HUDI-260) Hudi Spark Bundle does not work when passed in extraClassPath option

2019-09-19 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933672#comment-16933672 ] Udit Mehrotra commented on HUDI-260: Can you give the following a shot: spark.driver.extraClassPath

[jira] [Commented] (HUDI-83) Support for timestamp datatype in Hudi

2019-09-19 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933660#comment-16933660 ] Udit Mehrotra commented on HUDI-83: --- Ack. Will be focusing on

[jira] [Created] (HUDI-281) HiveSync failure through Spark when useJdbc is set to false

2019-09-25 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-281: -- Summary: HiveSync failure through Spark when useJdbc is set to false Key: HUDI-281 URL: https://issues.apache.org/jira/browse/HUDI-281 Project: Apache Hudi (incubating)

[jira] [Updated] (HUDI-281) HiveSync failure through Spark when useJdbc is set to false

2019-09-25 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-281: --- Description: Table creation with Hive sync through Spark fails, when I set *useJdbc* to *false*.

[jira] [Updated] (HUDI-268) Allow parquet/avro versions upgrading in Hudi

2019-10-16 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-268: --- Status: Closed (was: Patch Available) > Allow parquet/avro versions upgrading in Hudi >

[jira] [Reopened] (HUDI-268) Allow parquet/avro versions upgrading in Hudi

2019-10-16 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reopened HUDI-268: > Allow parquet/avro versions upgrading in Hudi > - > >

[jira] [Resolved] (HUDI-268) Allow parquet/avro versions upgrading in Hudi

2019-10-16 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra resolved HUDI-268. Fix Version/s: 0.5.0 Resolution: Fixed > Allow parquet/avro versions upgrading in Hudi >

[jira] [Created] (HUDI-303) Avro schema case sensitivity testing

2019-10-15 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-303: -- Summary: Avro schema case sensitivity testing Key: HUDI-303 URL: https://issues.apache.org/jira/browse/HUDI-303 Project: Apache Hudi (incubating) Issue Type:

[jira] [Assigned] (HUDI-306) Get to Hudi to support AWS Glue Catalog and other Hive Metastore implementations

2019-10-17 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-306: -- Assignee: Udit Mehrotra > Get to Hudi to support AWS Glue Catalog and other Hive Metastore >

[jira] [Created] (HUDI-306) Get to Hudi to support AWS Glue Catalog and other Hive Metastore implementations

2019-10-17 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-306: -- Summary: Get to Hudi to support AWS Glue Catalog and other Hive Metastore implementations Key: HUDI-306 URL: https://issues.apache.org/jira/browse/HUDI-306 Project:

[jira] [Updated] (HUDI-306) Get to Hudi to support AWS Glue Catalog and other Hive Metastore implementations

2019-10-17 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-306: --- Status: Patch Available (was: In Progress) > Get to Hudi to support AWS Glue Catalog and other Hive

[jira] [Updated] (HUDI-306) Get to Hudi to support AWS Glue Catalog and other Hive Metastore implementations

2019-10-17 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-306: --- Labels: pull-request-available (was: ) > Get to Hudi to support AWS Glue Catalog and other Hive

[jira] [Assigned] (HUDI-298) Upsert MOR table but got a NULL value

2019-10-11 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-298: -- Assignee: Udit Mehrotra > Upsert MOR table but got a NULL value >

[jira] [Commented] (HUDI-298) Upsert MOR table but got a NULL value

2019-10-11 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949708#comment-16949708 ] Udit Mehrotra commented on HUDI-298: This is an issue with how on-the-fly merge is being performed

[jira] [Commented] (HUDI-298) Upsert MOR table but got a NULL value

2019-10-14 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16951249#comment-16951249 ] Udit Mehrotra commented on HUDI-298: > Also typically at least for data ingestion, data schema managed

[jira] [Created] (HUDI-656) Write Performance - Driver spends too much time creating Parquet DataSource after writes

2020-03-04 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-656: -- Summary: Write Performance - Driver spends too much time creating Parquet DataSource after writes Key: HUDI-656 URL: https://issues.apache.org/jira/browse/HUDI-656

[jira] [Assigned] (HUDI-656) Write Performance - Driver spends too much time creating Parquet DataSource after writes

2020-03-04 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-656: -- Assignee: Udit Mehrotra > Write Performance - Driver spends too much time creating Parquet

[jira] [Created] (HUDI-607) Hive sync fails to register tables partitioned by Date Type column

2020-02-12 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-607: -- Summary: Hive sync fails to register tables partitioned by Date Type column Key: HUDI-607 URL: https://issues.apache.org/jira/browse/HUDI-607 Project: Apache Hudi

[jira] [Assigned] (HUDI-607) Hive sync fails to register tables partitioned by Date Type column

2020-02-12 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-607: -- Assignee: Udit Mehrotra > Hive sync fails to register tables partitioned by Date Type column >

[jira] [Created] (HUDI-519) Document the need for Avro dependency shading/relocation for custom payloads

2020-01-10 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-519: -- Summary: Document the need for Avro dependency shading/relocation for custom payloads Key: HUDI-519 URL: https://issues.apache.org/jira/browse/HUDI-519 Project: Apache

[jira] [Assigned] (HUDI-530) Datasource Writer throws error on resolving struct fields

2020-01-13 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-530: -- Assignee: Udit Mehrotra > Datasource Writer throws error on resolving struct fields >

[jira] [Created] (HUDI-516) Avoid need to import spark-avro package when submitting Hudi job in spark

2020-01-09 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-516: -- Summary: Avoid need to import spark-avro package when submitting Hudi job in spark Key: HUDI-516 URL: https://issues.apache.org/jira/browse/HUDI-516 Project: Apache Hudi

[jira] [Created] (HUDI-530) Datasource Writer throws error on resolving struct fields

2020-01-13 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-530: -- Summary: Datasource Writer throws error on resolving struct fields Key: HUDI-530 URL: https://issues.apache.org/jira/browse/HUDI-530 Project: Apache Hudi (incubating)

[jira] [Commented] (HUDI-672) Spark DataSource - Upsert for S3 Hudi dataset with large partitions takes a lot of time in writing

2020-03-06 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17053834#comment-17053834 ] Udit Mehrotra commented on HUDI-672: This issue is duplicated by

[jira] [Created] (HUDI-807) Support for incremental queries for bootstrapped tables

2020-04-17 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-807: -- Summary: Support for incremental queries for bootstrapped tables Key: HUDI-807 URL: https://issues.apache.org/jira/browse/HUDI-807 Project: Apache Hudi (incubating)

[jira] [Created] (HUDI-806) Implement support for bootstrapping via Spark datasource API

2020-04-17 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-806: -- Summary: Implement support for bootstrapping via Spark datasource API Key: HUDI-806 URL: https://issues.apache.org/jira/browse/HUDI-806 Project: Apache Hudi (incubating)

[jira] [Created] (HUDI-808) Support for cleaning source data

2020-04-17 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-808: -- Summary: Support for cleaning source data Key: HUDI-808 URL: https://issues.apache.org/jira/browse/HUDI-808 Project: Apache Hudi (incubating) Issue Type:

[jira] [Updated] (HUDI-83) Map Timestamp type in spark to corresponding Timestamp type in Hive during Hive sync

2020-03-13 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-83?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-83: -- Component/s: Hive Integration > Map Timestamp type in spark to corresponding Timestamp type in Hive

[jira] [Updated] (HUDI-83) Map Timestamp type in spark to corresponding Timestamp type in Hive during Hive sync

2020-03-13 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-83?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-83: -- Summary: Map Timestamp type in spark to corresponding Timestamp type in Hive during Hive sync (was:

[jira] [Commented] (HUDI-83) Support for timestamp datatype in Hudi

2020-03-13 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17059099#comment-17059099 ] Udit Mehrotra commented on HUDI-83: --- [~arw357] Timestamp type is supported while writing through spark and

[jira] [Commented] (HUDI-83) Map Timestamp type in spark to corresponding Timestamp type in Hive during Hive sync

2020-03-13 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17059103#comment-17059103 ] Udit Mehrotra commented on HUDI-83: --- Updated the Jira title to more accurately reflect what it is for, >

[jira] [Assigned] (HUDI-426) Implement Spark DataSource Support for querying bootstrapped tables

2020-04-02 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-426: -- Assignee: Udit Mehrotra (was: Nicholas Jiang) > Implement Spark DataSource Support for

[jira] [Assigned] (HUDI-620) Hive Sync Integration of bootstrapped table

2020-04-23 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-620: -- Assignee: Udit Mehrotra > Hive Sync Integration of bootstrapped table >

[jira] [Comment Edited] (HUDI-829) Efficiently reading hudi tables through spark-shell

2020-04-22 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090098#comment-17090098 ] Udit Mehrotra edited comment on HUDI-829 at 4/23/20, 12:07 AM: --- You may also

[jira] [Comment Edited] (HUDI-829) Efficiently reading hudi tables through spark-shell

2020-04-22 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090098#comment-17090098 ] Udit Mehrotra edited comment on HUDI-829 at 4/22/20, 11:59 PM: --- You may also

[jira] [Commented] (HUDI-829) Efficiently reading hudi tables through spark-shell

2020-04-22 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090098#comment-17090098 ] Udit Mehrotra commented on HUDI-829: You may also want to look at my implementation of custom relation

[jira] [Commented] (HUDI-829) Efficiently reading hudi tables through spark-shell

2020-04-22 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090095#comment-17090095 ] Udit Mehrotra commented on HUDI-829: [~nishith29] Thanks for creating the ticket. So the issue I was

[jira] [Created] (HUDI-838) Support getting schema from commit metadata for HiveSync

2020-04-24 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-838: -- Summary: Support getting schema from commit metadata for HiveSync Key: HUDI-838 URL: https://issues.apache.org/jira/browse/HUDI-838 Project: Apache Hudi (incubating)

[jira] [Created] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2020-05-10 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-874: -- Summary: Schema evolution does not work with AWS Glue catalog Key: HUDI-874 URL: https://issues.apache.org/jira/browse/HUDI-874 Project: Apache Hudi (incubating)

[jira] [Commented] (HUDI-721) AvroConversionUtils is broken for complex types in 0.6

2020-03-18 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17062247#comment-17062247 ] Udit Mehrotra commented on HUDI-721: This could be related to

[jira] [Commented] (HUDI-724) Parallelize GetSmallFiles For Partitions

2020-03-19 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17063033#comment-17063033 ] Udit Mehrotra commented on HUDI-724: Thanks Feichi for putting this out! [~vinoth] [~vbalaji] Feichi

[jira] [Updated] (HUDI-915) Partition Columns missing in files upserted after Metadata Bootstrap

2020-05-19 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-915: --- Description: This issue happens when the source data is partitioned using _*hive-style

[jira] [Updated] (HUDI-915) Partition Columns missing in files upserted after Metadata Bootstrap

2020-05-19 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-915: --- Description: This issue happens when the source data is partitioned using _*hive-style

[jira] [Created] (HUDI-915) Partition Columns missing in files upserted after Metadata Bootstrap

2020-05-19 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-915: -- Summary: Partition Columns missing in files upserted after Metadata Bootstrap Key: HUDI-915 URL: https://issues.apache.org/jira/browse/HUDI-915 Project: Apache Hudi

[jira] [Commented] (HUDI-1312) Query side use of Metadata Table

2020-10-13 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17213289#comment-17213289 ] Udit Mehrotra commented on HUDI-1312: - [~vinoth] I think either me or someone from AWS can work on

[jira] [Assigned] (HUDI-1312) Query side use of Metadata Table

2020-10-13 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-1312: --- Assignee: Udit Mehrotra > Query side use of Metadata Table >

[jira] [Commented] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2020-10-09 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211437#comment-17211437 ] Udit Mehrotra commented on HUDI-874: This fix is already on emr-6.1.0 release. However it's not yet

[jira] [Assigned] (HUDI-1230) Spark-submit for MOR table creation via DataSource API hangs

2020-08-26 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-1230: --- Assignee: Udit Mehrotra > Spark-submit for MOR table creation via DataSource API hangs >

[jira] [Created] (HUDI-1230) Spark-submit for MOR table creation via DataSource API hangs

2020-08-26 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-1230: --- Summary: Spark-submit for MOR table creation via DataSource API hangs Key: HUDI-1230 URL: https://issues.apache.org/jira/browse/HUDI-1230 Project: Apache Hudi

[jira] [Commented] (HUDI-83) Map Timestamp type in spark to corresponding Timestamp type in Hive during Hive sync

2020-09-22 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200473#comment-17200473 ] Udit Mehrotra commented on HUDI-83: --- [~FelixKJose] I did a quick test. To be able to sync timestamp column as

[jira] [Commented] (HUDI-83) Map Timestamp type in spark to corresponding Timestamp type in Hive during Hive sync

2020-09-22 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200466#comment-17200466 ] Udit Mehrotra commented on HUDI-83: --- [~FelixKJose] I have not really tried this on EMR 6. Let me test and

[jira] [Resolved] (HUDI-1213) Set Default for the bootstrap config : hoodie.bootstrap.full.input.provider

2020-09-25 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra resolved HUDI-1213. - Resolution: Fixed > Set Default for the bootstrap config : hoodie.bootstrap.full.input.provider >

[jira] [Commented] (HUDI-721) AvroConversionUtils is broken for complex types in 0.6

2020-05-24 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17115685#comment-17115685 ] Udit Mehrotra commented on HUDI-721: [~shivnarayan] this particular issue has already been fixed by

[jira] [Commented] (HUDI-1312) Query side use of Metadata Table

2020-10-28 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222402#comment-17222402 ] Udit Mehrotra commented on HUDI-1312: - [~vinoth] I have some cycles now, and was going to dive deep

[jira] [Commented] (HUDI-1098) Marker file finalizing may block on a data file that was never written

2020-08-03 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17170445#comment-17170445 ] Udit Mehrotra commented on HUDI-1098: - [~vinoth] [~shivnarayan] Actually it's not that straight

[jira] [Assigned] (HUDI-1108) Allow parallel listing of dataset partitions for various actions during write

2020-08-06 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-1108: --- Assignee: Ryan Pifer (was: Udit Mehrotra) > Allow parallel listing of dataset partitions

[jira] [Assigned] (HUDI-1108) Allow parallel listing of dataset partitions for various actions during write

2020-08-06 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-1108: --- Assignee: (was: Udit Mehrotra) > Allow parallel listing of dataset partitions for

[jira] [Assigned] (HUDI-1108) Allow parallel listing of dataset partitions for various actions during write

2020-08-06 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-1108: --- Assignee: Udit Mehrotra > Allow parallel listing of dataset partitions for various actions

[jira] [Created] (HUDI-1157) Optimization whether to query Bootstrapped table using HoodieBootstrapRelation vs Sparks Parquet datasource

2020-08-06 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-1157: --- Summary: Optimization whether to query Bootstrapped table using HoodieBootstrapRelation vs Sparks Parquet datasource Key: HUDI-1157 URL:

[jira] [Reopened] (HUDI-427) Implement CLI support for performing bootstrap

2020-08-04 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reopened HUDI-427: > Implement CLI support for performing bootstrap > -- > >

[jira] [Assigned] (HUDI-1158) Optimizations in parallelized listing behaviour for markers and bootstrap source files

2020-08-06 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-1158: --- Assignee: Udit Mehrotra > Optimizations in parallelized listing behaviour for markers and

[jira] [Assigned] (HUDI-1158) Optimizations in parallelized listing behaviour for markers and bootstrap source files

2020-08-06 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-1158: --- Assignee: (was: Udit Mehrotra) > Optimizations in parallelized listing behaviour for

[jira] [Created] (HUDI-1158) Optimizations in parallelized listing behaviour for markers and bootstrap source files

2020-08-06 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-1158: --- Summary: Optimizations in parallelized listing behaviour for markers and bootstrap source files Key: HUDI-1158 URL: https://issues.apache.org/jira/browse/HUDI-1158

[jira] [Created] (HUDI-1174) Hudi changes for bootstrapped tables integration with Presto

2020-08-10 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-1174: --- Summary: Hudi changes for bootstrapped tables integration with Presto Key: HUDI-1174 URL: https://issues.apache.org/jira/browse/HUDI-1174 Project: Apache Hudi

[jira] [Resolved] (HUDI-999) Parallelize listing of Source dataset partitions

2020-08-10 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra resolved HUDI-999. Resolution: Fixed > Parallelize listing of Source dataset partitions >

[jira] [Resolved] (HUDI-426) Implement Spark DataSource Support for querying bootstrapped tables

2020-08-10 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra resolved HUDI-426. Resolution: Fixed > Implement Spark DataSource Support for querying bootstrapped tables >

[jira] [Updated] (HUDI-620) Hive Sync Integration of bootstrapped table

2020-08-10 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-620: --- Status: Closed (was: Patch Available) > Hive Sync Integration of bootstrapped table >

[jira] [Commented] (HUDI-620) Hive Sync Integration of bootstrapped table

2020-08-10 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175092#comment-17175092 ] Udit Mehrotra commented on HUDI-620: Resolved by https://github.com/apache/hudi/pull/1702/ > Hive Sync

[jira] [Commented] (HUDI-1138) Re-implement marker files via timeline server

2020-07-31 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17169028#comment-17169028 ] Udit Mehrotra commented on HUDI-1138: - Another potential performance improvement for listing/deletion

[jira] [Commented] (HUDI-1098) Marker file finalizing may block on a data file that was never written

2020-07-31 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17169034#comment-17169034 ] Udit Mehrotra commented on HUDI-1098: - [~shivnarayan] [~vinoth] thanks for prioritizing this issue.

[jira] [Commented] (HUDI-1183) PrestoDB dependency on Apache Hudi

2020-08-12 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176615#comment-17176615 ] Udit Mehrotra commented on HUDI-1183: - Some of the dependency conflicts that require shading can be found

[jira] [Assigned] (HUDI-1184) Support updatePartitionPath for HBaseIndex

2020-08-13 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-1184: --- Assignee: Ryan Pifer > Support updatePartitionPath for HBaseIndex >

[jira] [Resolved] (HUDI-427) Implement CLI support for performing bootstrap

2020-08-10 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra resolved HUDI-427. Resolution: Fixed > Implement CLI support for performing bootstrap >

[jira] [Commented] (HUDI-1079) Cannot upsert on schema with Array of Record with single field

2020-07-09 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155004#comment-17155004 ] Udit Mehrotra commented on HUDI-1079: - Also if you just try the following in Spark: {code:java} val df

[jira] [Commented] (HUDI-1079) Cannot upsert on schema with Array of Record with single field

2020-07-09 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155000#comment-17155000 ] Udit Mehrotra commented on HUDI-1079: - [~tase] I will also look into this sometime this week, but to

[jira] [Commented] (HUDI-1013) Bulk Insert w/o converting to RDD

2020-06-15 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136121#comment-17136121 ] Udit Mehrotra commented on HUDI-1013: - [~shivnarayan] and [~vinoth] Thanks for driving this effort!

[jira] [Created] (HUDI-1054) Address performance issues with finalizing writes on S3

2020-06-25 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-1054: --- Summary: Address performance issues with finalizing writes on S3 Key: HUDI-1054 URL: https://issues.apache.org/jira/browse/HUDI-1054 Project: Apache Hudi

[jira] [Commented] (HUDI-83) Map Timestamp type in spark to corresponding Timestamp type in Hive during Hive sync

2020-06-16 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17137926#comment-17137926 ] Udit Mehrotra commented on HUDI-83: --- [~chenxiang] Looking forward to a PR to understand the work you have

[jira] [Updated] (HUDI-1021) [Bug] Unable to update bootstrapped table using rows from the written bootstrapped table

2020-06-10 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-1021: Description:   {noformat} Caused by: org.apache.hudi.exception.HoodieUpsertException: Error

[jira] [Created] (HUDI-1021) [Bug] Unable to update bootstrapped table using rows from the written bootstrapped table

2020-06-10 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-1021: --- Summary: [Bug] Unable to update bootstrapped table using rows from the written bootstrapped table Key: HUDI-1021 URL: https://issues.apache.org/jira/browse/HUDI-1021

[jira] [Updated] (HUDI-1021) [Bug] Unable to update bootstrapped table using rows from the written bootstrapped table

2020-06-10 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-1021: Description: Reproduction Steps:   {code:java} import spark.implicits._ import

[jira] [Updated] (HUDI-1021) [Bug] Unable to update bootstrapped table using rows from the written bootstrapped table

2020-06-10 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-1021: Description: Reproduction Steps:   {code:java} import spark.implicits._ import

[jira] [Assigned] (HUDI-1021) [Bug] Unable to update bootstrapped table using rows from the written bootstrapped table

2020-06-10 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-1021: --- Assignee: Balaji Varadarajan > [Bug] Unable to update bootstrapped table using rows from the

[jira] [Created] (HUDI-991) Bootstrap Implementation Bugs

2020-06-03 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-991: -- Summary: Bootstrap Implementation Bugs Key: HUDI-991 URL: https://issues.apache.org/jira/browse/HUDI-991 Project: Apache Hudi Issue Type: Sub-task

[jira] [Updated] (HUDI-992) For hive-style partitioned source data, partition columns synced with Hive will always have String type

2020-06-03 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-992: --- Parent: HUDI-242 Issue Type: Sub-task (was: Bug) > For hive-style partitioned source data,

[jira] [Created] (HUDI-992) For hive-style partitioned source data, partition columns synced with Hive will always have String type

2020-06-03 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-992: -- Summary: For hive-style partitioned source data, partition columns synced with Hive will always have String type Key: HUDI-992 URL: https://issues.apache.org/jira/browse/HUDI-992

[jira] [Assigned] (HUDI-1108) Allow parallel listing of dataset partitions for various actions during write

2020-07-17 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-1108: --- Assignee: Udit Mehrotra > Allow parallel listing of dataset partitions for various actions

[jira] [Commented] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2020-07-22 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163073#comment-17163073 ] Udit Mehrotra commented on HUDI-874: This has been fixed by EMR folks, but the fix will make it in

[jira] [Created] (HUDI-1102) Separate out Spark and Path detection utilities used in Bootstrap datasource work

2020-07-16 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-1102: --- Summary: Separate out Spark and Path detection utilities used in Bootstrap datasource work Key: HUDI-1102 URL: https://issues.apache.org/jira/browse/HUDI-1102 Project:
