Github user sounakr commented on the issue:
https://github.com/apache/incubator-carbondata/pull/604
@jackylk: This problem is reproducible on a multi-node system, where a
single load creates multiple carbon data files through multiple tasks. I have
tested the code changes on a multi-node cluster with large data, and the
output is correct.
Another problem observed while testing is choosing the correct cardinality
for the new segment formed by compaction. Previously we copied the last
segment's cardinality to the new compacted segment. But with the IUD feature,
updates can run on any segment, and if an update happens on an intermediate
segment, there is a high probability that the cardinality of that segment
will be higher than that of the last segment. So now, instead of copying the
cardinality from the last segment, we calculate the highest cardinality
values across all the segments.
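
To illustrate the idea only (this is not the code in the PR), here is a
minimal Python sketch. It assumes each segment's cardinality is a list with
one integer per column and that "highest" means a per-column maximum; the
function name `merged_cardinality` and the sample numbers are hypothetical.

```python
from typing import List, Sequence


def merged_cardinality(segment_cardinalities: Sequence[Sequence[int]]) -> List[int]:
    """Return the per-column maximum cardinality across all segments.

    Hypothetical helper: assumes every segment reports one cardinality
    value per column, in the same column order.
    """
    if not segment_cardinalities:
        return []
    width = len(segment_cardinalities[0])
    if any(len(card) != width for card in segment_cardinalities):
        raise ValueError("all segments must report the same number of columns")
    # zip(*...) groups the values column by column; take the max of each group.
    return [max(column) for column in zip(*segment_cardinalities)]


# Segment 1 was updated after segment 2 was loaded, so it holds the
# highest values for columns 0 and 2.
segments = [
    [10, 4, 7],   # segment 0
    [25, 4, 9],   # segment 1 (updated)
    [12, 5, 7],   # segment 2 (last segment)
]
result = merged_cardinality(segments)  # [25, 5, 9]
```

Copying only the last segment here would give `[12, 5, 7]`, which is too small
for columns 0 and 2 of the updated intermediate segment.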