[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prashant Wason updated HUDI-1779:
    Fix Version/s: 0.14.1
                       (was: 0.14.0)

> Fail to bootstrap/upsert a table which contains timestamp column
>
> Key: HUDI-1779
> URL: https://issues.apache.org/jira/browse/HUDI-1779
> Project: Apache Hudi
> Issue Type: Bug
> Components: dependencies, spark
> Reporter: lrz
> Assignee: Ethan Guo
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.14.1
> Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
>
> Currently, when Hudi bootstraps a Parquet file, or upserts into a Parquet
> file that contains a timestamp column, the operation fails for the
> following reasons:
> 1) During bootstrap, if the original Parquet file was written by a Spark
> application, Spark saves timestamps as INT96 by default (see
> spark.sql.parquet.int96AsTimestamp). Bootstrap then fails because Hudi
> cannot currently read the INT96 type. (This can be solved by upgrading
> Parquet to 1.12.0 and setting parquet.avro.readInt96AsFixed=true; see
> https://github.com/apache/parquet-mr/pull/831/files.)
> 2) After bootstrap, an upsert fails because we use the Hoodie schema to
> read the original Parquet file. The schemas do not match: the Hoodie schema
> treats the timestamp as a long, while in the original file it is INT96.
> 3) After bootstrap, a partial update of a Parquet file fails because we
> copy the old record and save it using the Hoodie schema (we are missing a
> convertFixedToLong operation like the one Spark has).

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
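Item 3 of the description mentions a missing fixed-to-long conversion. As a rough illustration of what such a step involves, the sketch below decodes a 12-byte Parquet INT96 timestamp into microseconds since the Unix epoch. It assumes the commonly used INT96 layout (8 bytes of nanoseconds within the day followed by a 4-byte Julian day number, both little-endian). The function name `int96_to_epoch_micros` is hypothetical and is not taken from Hudi or Spark; this is not the fix that was applied to the issue.

```python
import struct

JULIAN_DAY_OF_UNIX_EPOCH = 2_440_588  # Julian day number of 1970-01-01
MICROS_PER_DAY = 86_400 * 1_000_000


def int96_to_epoch_micros(raw: bytes) -> int:
    """Decode a 12-byte INT96 timestamp into microseconds since the Unix epoch.

    Hypothetical helper for illustration only; assumes the little-endian
    layout described above.
    """
    if len(raw) != 12:
        raise ValueError(f"expected 12 bytes, got {len(raw)}")
    # "<qi": little-endian signed 64-bit nanos-of-day, then signed 32-bit Julian day.
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    days_since_epoch = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    # Sub-microsecond precision is truncated, since the target type is a long of micros.
    return days_since_epoch * MICROS_PER_DAY + nanos_of_day // 1_000


# Midnight on 1970-01-01 decodes to 0.
epoch_value = int96_to_epoch_micros(struct.pack("<qi", 0, JULIAN_DAY_OF_UNIX_EPOCH))
```

A record copied from the original file would need each INT96 field passed through a conversion of this kind before being written with a schema that declares the column as a long.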
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yue Zhang updated HUDI-1779:
    Fix Version/s: 0.14.0
                       (was: 0.13.1)

> Fail to bootstrap/upsert a table which contains timestamp column
>
> Key: HUDI-1779
> URL: https://issues.apache.org/jira/browse/HUDI-1779
> Project: Apache Hudi
> Issue Type: Bug
> Components: dependencies, spark
> Reporter: lrz
> Assignee: Ethan Guo
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.14.0
> Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan updated HUDI-1779:
    Fix Version/s:     (was: 0.12.3)

> Fail to bootstrap/upsert a table which contains timestamp column
>
> Key: HUDI-1779
> URL: https://issues.apache.org/jira/browse/HUDI-1779
> Project: Apache Hudi
> Issue Type: Bug
> Components: dependencies, spark
> Reporter: lrz
> Assignee: Ethan Guo
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.13.1
> Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-1779:
    Fix Version/s: 0.12.3

> Fail to bootstrap/upsert a table which contains timestamp column
>
> Key: HUDI-1779
> URL: https://issues.apache.org/jira/browse/HUDI-1779
> Project: Apache Hudi
> Issue Type: Bug
> Components: dependencies, spark
> Reporter: lrz
> Assignee: Ethan Guo
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.13.1, 0.12.3
> Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Kudinkin updated HUDI-1779:
    Fix Version/s: 0.13.1
                       (was: 0.13.0)

> Fail to bootstrap/upsert a table which contains timestamp column
>
> Key: HUDI-1779
> URL: https://issues.apache.org/jira/browse/HUDI-1779
> Project: Apache Hudi
> Issue Type: Bug
> Components: dependencies, spark
> Reporter: lrz
> Assignee: Ethan Guo
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.13.1
> Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-1779:
    Sprint: 2022/08/22, 2022/09/05
                (was: 2022/08/22, 2022/09/05, 2022/09/19)

> Fail to bootstrap/upsert a table which contains timestamp column
>
> Key: HUDI-1779
> URL: https://issues.apache.org/jira/browse/HUDI-1779
> Project: Apache Hudi
> Issue Type: Bug
> Components: dependencies, spark
> Reporter: lrz
> Assignee: Ethan Guo
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.13.0
> Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-1779:
    Sprint: 2022/08/22, 2022/09/05, 2022/09/19
                (was: 2022/08/22, 2022/09/05)

> Fail to bootstrap/upsert a table which contains timestamp column
>
> Key: HUDI-1779
> URL: https://issues.apache.org/jira/browse/HUDI-1779
> Project: Apache Hudi
> Issue Type: Bug
> Components: dependencies, spark
> Reporter: lrz
> Assignee: Ethan Guo
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.13.0
> Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-1779:
    Sprint: 2022/08/22, 2022/09/05
                (was: 2022/08/22)

> Fail to bootstrap/upsert a table which contains timestamp column
>
> Key: HUDI-1779
> URL: https://issues.apache.org/jira/browse/HUDI-1779
> Project: Apache Hudi
> Issue Type: Bug
> Components: dependencies, spark
> Reporter: lrz
> Assignee: Ethan Guo
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.13.0
> Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Guo updated HUDI-1779:
    Fix Version/s:     (was: 0.12.1)

> Fail to bootstrap/upsert a table which contains timestamp column
>
> Key: HUDI-1779
> URL: https://issues.apache.org/jira/browse/HUDI-1779
> Project: Apache Hudi
> Issue Type: Bug
> Components: dependencies, spark
> Reporter: lrz
> Assignee: Alexey Kudinkin
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.13.0
> Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Guo updated HUDI-1779:
    Fix Version/s: 0.13.0
         Priority: Blocker  (was: Critical)

> Fail to bootstrap/upsert a table which contains timestamp column
>
> Key: HUDI-1779
> URL: https://issues.apache.org/jira/browse/HUDI-1779
> Project: Apache Hudi
> Issue Type: Bug
> Components: dependencies, spark
> Reporter: lrz
> Assignee: Alexey Kudinkin
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.12.1, 0.13.0
> Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Guo updated HUDI-1779:
    Story Points: 2

> Fail to bootstrap/upsert a table which contains timestamp column
>
> Key: HUDI-1779
> URL: https://issues.apache.org/jira/browse/HUDI-1779
> Project: Apache Hudi
> Issue Type: Bug
> Components: dependencies, spark
> Reporter: lrz
> Assignee: Alexey Kudinkin
> Priority: Critical
> Labels: pull-request-available
> Fix For: 0.12.1
> Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Guo updated HUDI-1779:
    Epic Link: HUDI-1265

> Fail to bootstrap/upsert a table which contains timestamp column
>
> Key: HUDI-1779
> URL: https://issues.apache.org/jira/browse/HUDI-1779
> Project: Apache Hudi
> Issue Type: Bug
> Components: dependencies, spark
> Reporter: lrz
> Assignee: Alexey Kudinkin
> Priority: Critical
> Labels: pull-request-available
> Fix For: 0.12.1
> Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sagar Sumit updated HUDI-1779:
    Sprint: 2022/08/22

> Fail to bootstrap/upsert a table which contains timestamp column
>
> Key: HUDI-1779
> URL: https://issues.apache.org/jira/browse/HUDI-1779
> Project: Apache Hudi
> Issue Type: Bug
> Components: dependencies, spark
> Reporter: lrz
> Assignee: Alexey Kudinkin
> Priority: Critical
> Labels: pull-request-available
> Fix For: 0.12.1
> Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sagar Sumit updated HUDI-1779:
    Fix Version/s: 0.12.1
                       (was: 0.12.0)

> Fail to bootstrap/upsert a table which contains timestamp column
>
> Key: HUDI-1779
> URL: https://issues.apache.org/jira/browse/HUDI-1779
> Project: Apache Hudi
> Issue Type: Bug
> Components: dependencies, spark
> Reporter: lrz
> Assignee: Alexey Kudinkin
> Priority: Critical
> Labels: pull-request-available
> Fix For: 0.12.1
> Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-1779:
    Priority: Critical  (was: Major)

> Fail to bootstrap/upsert a table which contains timestamp column
>
> Key: HUDI-1779
> URL: https://issues.apache.org/jira/browse/HUDI-1779
> Project: Apache Hudi
> Issue Type: Bug
> Components: dependencies, spark
> Reporter: lrz
> Assignee: Alexey Kudinkin
> Priority: Critical
> Labels: pull-request-available
> Fix For: 0.12.0
> Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-1779:
    Component/s: dependencies

> Fail to bootstrap/upsert a table which contains timestamp column
>
> Key: HUDI-1779
> URL: https://issues.apache.org/jira/browse/HUDI-1779
> Project: Apache Hudi
> Issue Type: Bug
> Components: dependencies, spark
> Reporter: lrz
> Assignee: Alexey Kudinkin
> Priority: Major
> Labels: pull-request-available, query-eng, sev:high
> Fix For: 0.12.0
> Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-1779:
    Labels: pull-request-available
                (was: pull-request-available query-eng sev:high)

> Fail to bootstrap/upsert a table which contains timestamp column
>
> Key: HUDI-1779
> URL: https://issues.apache.org/jira/browse/HUDI-1779
> Project: Apache Hudi
> Issue Type: Bug
> Components: dependencies, spark
> Reporter: lrz
> Assignee: Alexey Kudinkin
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.12.0
> Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-1779:
    Fix Version/s:     (was: 0.11.0)

> Fail to bootstrap/upsert a table which contains timestamp column
>
> Key: HUDI-1779
> URL: https://issues.apache.org/jira/browse/HUDI-1779
> Project: Apache Hudi
> Issue Type: Bug
> Components: spark
> Reporter: lrz
> Assignee: Alexey Kudinkin
> Priority: Major
> Labels: pull-request-available, query-eng, sev:high
> Fix For: 0.12.0
> Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-1779:
    Component/s: spark

> Fail to bootstrap/upsert a table which contains timestamp column
>
> Key: HUDI-1779
> URL: https://issues.apache.org/jira/browse/HUDI-1779
> Project: Apache Hudi
> Issue Type: Bug
> Components: spark
> Reporter: lrz
> Assignee: Alexey Kudinkin
> Priority: Major
> Labels: pull-request-available, query-eng, sev:high
> Fix For: 0.11.0, 0.12.0
> Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
     [ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-1779:
-----------------------------
    Fix Version/s: 0.12.0

> Fail to bootstrap/upsert a table which contains timestamp column
> ----------------------------------------------------------------
>
>                 Key: HUDI-1779
>                 URL: https://issues.apache.org/jira/browse/HUDI-1779
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: lrz
>            Assignee: Alexey Kudinkin
>            Priority: Major
>              Labels: pull-request-available, query-eng, sev:high
>             Fix For: 0.11.0, 0.12.0
>
>         Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
     [ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan updated HUDI-1779:
--------------------------------------
    Labels: pull-request-available query-eng sev:high  (was: pull-request-available sev:high)

> Fail to bootstrap/upsert a table which contains timestamp column
> ----------------------------------------------------------------
>
>                 Key: HUDI-1779
>                 URL: https://issues.apache.org/jira/browse/HUDI-1779
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: lrz
>            Assignee: Alexey Kudinkin
>            Priority: Major
>              Labels: pull-request-available, query-eng, sev:high
>             Fix For: 0.11.0
>
>         Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
     [ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Danny Chen updated HUDI-1779:
-----------------------------
    Fix Version/s:     (was: 0.10.0)

> Fail to bootstrap/upsert a table which contains timestamp column
> ----------------------------------------------------------------
>
>                 Key: HUDI-1779
>                 URL: https://issues.apache.org/jira/browse/HUDI-1779
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: lrz
>            Priority: Major
>              Labels: pull-request-available, sev:high
>             Fix For: 0.11.0
>
>         Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
     [ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Udit Mehrotra updated HUDI-1779:
--------------------------------
    Fix Version/s:     (was: 0.9.0)
                   0.10.0

> Fail to bootstrap/upsert a table which contains timestamp column
> ----------------------------------------------------------------
>
>                 Key: HUDI-1779
>                 URL: https://issues.apache.org/jira/browse/HUDI-1779
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: lrz
>            Priority: Major
>              Labels: pull-request-available, sev:high
>             Fix For: 0.10.0
>
>         Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
     [ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan updated HUDI-1779:
--------------------------------------
    Labels: pull-request-available sev:high  (was: pull-request-available)

> Fail to bootstrap/upsert a table which contains timestamp column
> ----------------------------------------------------------------
>
>                 Key: HUDI-1779
>                 URL: https://issues.apache.org/jira/browse/HUDI-1779
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: lrz
>            Priority: Major
>              Labels: pull-request-available, sev:high
>             Fix For: 0.9.0
>
>         Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
     [ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-1779:
---------------------------------
    Labels: pull-request-available  (was: )

> Fail to bootstrap/upsert a table which contains timestamp column
> ----------------------------------------------------------------
>
>                 Key: HUDI-1779
>                 URL: https://issues.apache.org/jira/browse/HUDI-1779
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: lrz
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
>         Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
     [ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

lrz updated HUDI-1779:
----------------------
    Attachment: upsertFail.png

> Fail to bootstrap/upsert a table which contains timestamp column
> ----------------------------------------------------------------
>
>                 Key: HUDI-1779
>                 URL: https://issues.apache.org/jira/browse/HUDI-1779
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: lrz
>            Priority: Major
>             Fix For: 0.9.0
>
>         Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
     [ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

lrz updated HUDI-1779:
----------------------
    Attachment: unsupportInt96.png

> Fail to bootstrap/upsert a table which contains timestamp column
> ----------------------------------------------------------------
>
>                 Key: HUDI-1779
>                 URL: https://issues.apache.org/jira/browse/HUDI-1779
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: lrz
>            Priority: Major
>             Fix For: 0.9.0
>
>         Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png
[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column
     [ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

lrz updated HUDI-1779:
----------------------
    Attachment: upsertFail2.png

> Fail to bootstrap/upsert a table which contains timestamp column
> ----------------------------------------------------------------
>
>                 Key: HUDI-1779
>                 URL: https://issues.apache.org/jira/browse/HUDI-1779
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: lrz
>            Priority: Major
>             Fix For: 0.9.0
>
>         Attachments: unsupportInt96.png, upsertFail.png, upsertFail2.png