[ https://issues.apache.org/jira/browse/CARBONDATA-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ajantha Bhat closed CARBONDATA-4106.
------------------------------------
Fix Version/s: (was: 2.0.1)
Resolution: Not A Bug
> Compaction is not working properly
> ----------------------------------
>
> Key: CARBONDATA-4106
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4106
> Project: CarbonData
> Issue Type: Improvement
> Components: core
> Affects Versions: 2.0.1
> Environment: Apache spark 2.4.5, carbonData 2.0.1
> Reporter: suyash yadav
> Priority: Major
> Attachments: describe_fact_probe_1
>
>
> Hi Team,
> We are using Apache CarbonData 2.0.1 for one of our POCs, and we observed that
> we are not getting the expected benefit from compaction (both major and
> minor).
> Please find below the details of the issue we are facing:
> *Name of the table used*: fact_365_1_probe_1
> *Number of rows:*
> select count(*) from fact_365_1_probe_1
> +--------+
> |count(1)|
> +--------+
> |76963753|
> +--------+
> *Sample data from the table:*
> ======================
> +-------------------+--------------------------+------------------------------------+------------------+-------------+-------------------+
> |                 ts|                    metric|                             tags_id|             value|        epoch|                ts2|
> +-------------------+--------------------------+------------------------------------+------------------+-------------+-------------------+
> |2021-01-07 21:05:00|Probe.Duplicate.Poll.Count|c8dead9b-87ae-46ae-8703-bc2b7bfba5d4|39.611356797970274|1610033757768|2021-01-07 00:00:00|
> |2021-01-07 23:50:00|Probe.Duplicate.Poll.Count|62351ef2-f2ce-49d1-a2fd-a0d1e5f6a1b9| 72.70658115131307|1610043742516|2021-01-07 00:00:00|
> +-------------------+--------------------------+------------------------------------+------------------+-------------+-------------------+
>
> [^describe_fact_probe_1]
>
> I have attached the describe output which will show you the other details of
> the table.
> The size of the table is 3.24 GB, and even after running minor or major
> compaction the size remains almost the same.
> So we are not getting any benefit from running compaction. Could you please
> review the shared details and help us identify whether we are missing
> something here or whether there is a bug?
> We also need answers to the following questions about CarbonData storage:
> 1. For decimal values, how does storage behave? If one row has 20 digits
> after the decimal point and a second row has only 5, what is the difference
> in the storage taken?
> 2. If I have two tables, one with the same value in all 100 rows and the
> other with different values in its 100 rows, how will CarbonData behave as
> far as storage is concerned? Which table will take less storage, or will
> both take the same?
> 3. For the string datatype, could you please describe how storage is
> defined?
>
> ================
--
This message was sent by Atlassian Jira
(v8.3.4#803005)