[jira] [Issue Comment Deleted] (CARBONDATA-4055) Empty segment created and unnecessary entry to table status in update

2021-06-03 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal updated CARBONDATA-4055:

Comment: was deleted

(was: df.write.format("hudi").
  option(COMBINE_BEFORE_UPSERT_PROP, "false").
  option(PRECOMBINE_FIELD_OPT_KEY, "customerId").
  option(RECORDKEY_FIELD_OPT_KEY, "str_uuid").
  option(PARTITIONPATH_FIELD_OPT_KEY, "").
  option(DataSourceWriteOptions.OPERATION_OPT_KEY, "insert").
  option(DataSourceWriteOptions.HIVE_SYNC_ENABLED_OPT_KEY, "true").
  option(DataSourceWriteOptions.HIVE_PARTITION_FIELDS_OPT_KEY, "").
  option(DataSourceWriteOptions.HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY, "org.apache.hudi.hive.NonPartitionedExtractor").
  option(DataSourceWriteOptions.KEYGENERATOR_CLASS_OPT_KEY, "org.apache.hudi.keygen.NonpartitionedKeyGenerator").
  option(DataSourceWriteOptions.HIVE_DATABASE_OPT_KEY, db).
  option(DataSourceWriteOptions.HIVE_TABLE_OPT_KEY, tableName).
  option(TABLE_NAME, tableName).
  mode(Append).
  save(s"/hudicow6/${tableName}"))

> Empty segment created and unnecessary entry to table status in update
> -
>
> Key: CARBONDATA-4055
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4055
> Project: CarbonData
> Issue Type: Bug
> Reporter: Akash R Nilugal
> Assignee: Akash R Nilugal
> Priority: Major
> Fix For: 2.1.1
>
> Time Spent: 5.5h
> Remaining Estimate: 0h
>
> When the update command is executed and no data is updated, empty segment
> directories are created and a stale in-progress entry is added to the table
> status. These segment directories are not removed even by clean files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Issue Comment Deleted] (CARBONDATA-4055) Empty segment created and unnecessary entry to table status in update

2021-04-30 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal updated CARBONDATA-4055:

Comment: was deleted

(was: Akash
I am an Apache CarbonData PMC member and committer, working as a Senior
Technical Lead on the Cloud and AI/data platform team at the Bangalore Research
Center, Huawei. I have been working on big data, mainly Apache CarbonData, for
5 years, and have worked on and am interested in areas such as index support on
big data, materialized views, CDC on big data, Spark SQL query optimization,
Spark Structured Streaming, data lake and data warehouse functionality, and
Trino. I am currently working on CarbonData CDC.

Kunal
I am an Apache CarbonData PMC member and committer, working as a System
Architect on the Cloud and AI/data platform team at the Bangalore Research
Center, Huawei. I have worked on big data technologies such as Apache
CarbonData, Apache Spark, and Apache Hive for 5 years. Major features I have
worked on include the distributed index cache server, Hive + CarbonData
integration, pre-aggregation support, S3 support for CarbonData, secondary
index on CarbonData, and Spark SQL query optimization in CarbonData.
)



